(57) Abstract: The invention provides methods for inhibiting stem cell differentiation and for increasing the effective dose of stem
cells in a subject. HSC differentiation can be inhibited by applying an HSC differentiation-inhibiting polypeptide identified in the 
present invention to an HSC culture in vitro, or administering the polypeptide to a subject in vivo. Some other methods of the 
invention comprise first obtaining a population of hematopoietic stem cells, introducing into the cells an HSC differentiation-inhibiting
polynucleotide disclosed herein, and expressing the HSC differentiation-inhibiting polynucleotide in the cells. Such genetically
modified stem cells can be administered to a subject, whereby the effective dose of the stem cells in the subject can be increased. This
invention further provides novel molecular markers of hematopoietic stem cells, and methods for enriching hematopoietic stem cells 
using these novel markers. 
METHODS AND COMPOSITIONS FOR MODULATING STEM CELLS

CROSS-REFERENCE TO RELATED APPLICATIONS 

This application claims the benefit of priority to U.S. Provisional Patent 
Application Serial No. 60/447,030 (filed February 12, 2003), the disclosure of which is 
incorporated herein by reference in its entirety and for all purposes. 

FIELD OF THE INVENTION
The present invention generally relates to methods for enriching stem cell 
population and for modulating stem cell differentiation, as well as to therapeutic applications 
of such methods. More particularly, the invention pertains to genes differentially expressed 
in hematopoietic stem cells and to methods of using these genes to modulate stem cell 
differentiation. 



BACKGROUND OF THE INVENTION

Hematopoiesis (hemopoiesis) is a process whereby multi-potent stem cells 
give rise to lineage-restricted progeny. The molecular basis of hematopoiesis remains poorly 
understood. Hematopoietic stem cells (HSCs) are the only cells in the hematopoietic system 
that produce other stem cells and give rise to the entire range of blood and immune system 
cells. These cells are able to self-proliferate, so as to maintain a continuous source of 
regenerative cells. When subject to particular environments and/or factors, they can 
differentiate to dedicated progenitor cells, where the dedicated progenitor cells may serve as 
the ancestor cell to a limited number of blood cell types. 

HSCs and their progenies at the various development stages all play an 
important role in the normal function of the mammalian immune system. HSCs are of 
prominent therapeutic importance in many circumstances. In many diseased states, the 
disease is a result of some defect in the maturation process. In other situations, such as 
transplantation, there is a need to prevent the immune system from rejecting the transplant 
by irradiating the host. In neoplasia, a patient may be irradiated and/or treated with 
chemotherapeutic agents to destroy the neoplastic tissue, which often also damage or destroy
the host immune system. Further, other situations such as a severe insult to the immune 
system also result in a substantial reduction in stem cells and injury to the immune system. 
In all these situations, it will frequently be desirable to restore stem cells to the host. For 
example, HSCs are the active component in bone marrow transplantation (BMT), and 
transplant of highly purified HSC will completely restore the hematopoietic system in a 
manner indistinguishable from unfractionated bone marrow.

Despite decades of research, there are currently no satisfactory methods to 
expand the numbers of HSCs or accurately enumerate the numbers of expanded and 
engraftable HSCs following in vitro culture. There is a need in the art for better
methods for isolating, enriching, and enumerating transplantable HSCs. The instant 
invention fulfills this and other needs. 

SUMMARY OF THE INVENTION 

In one aspect, the invention provides methods for inhibiting differentiation of 
mammalian stem cells. The methods entail (a) providing a population of stem cells, (b) 
introducing a vector comprising an HSC differentiation-inhibiting polynucleotide of the 
present invention into the stem cells, and (c) expressing a polypeptide encoded by the
polynucleotide by culturing the modified stem cells, thereby inhibiting differentiation of the 
stem cells. In some of the methods, the stem cells are isolated from bone marrow. In some 
preferred methods, the stem cells are human hematopoietic stem cells. The human stem cells 
can be first selected for expression of CD34 and Thy prior to introduction of the vector. In
some of the methods, the HSC differentiation-inhibiting polynucleotide encodes
GATA-binding protein 3 or ID3.

In a related aspect, the invention provides methods for increasing the 
effective dose of hematopoietic stem cells in a mammalian subject. The methods require (a) 
providing a population of hematopoietic stem cells, (b) introducing into the cells an HSC 
differentiation-inhibiting polynucleotide of the present invention, and (c) administering the
genetically modified cells that express an HSC differentiation-inhibiting polypeptide to a 
mammalian subject; thereby increasing the effective dose of hematopoietic stem cells in the 
subject. In some of these methods, the administered stem cells are a subpopulation of the 
modified cells that are selected for expression of the polypeptide prior to administering to
the subject. In some preferred methods, the subject is human, and the hematopoietic stem
cells are human hematopoietic stem cells. In these methods, the hematopoietic stem cells 
can be selected for expression of CD34 and Thy prior to introducing into the cells the HSC 
differentiation-inhibiting polynucleotide. 

In another related aspect, the present invention provides methods for
inhibiting hematopoietic stem cell differentiation with an HSC differentiation-inhibiting
polypeptide identified by the present inventor. The methods entail contacting a population
of HSCs with an effective amount of the HSC differentiation-inhibiting polypeptide which
inhibits differentiation of the HSCs. In some methods, the HSCs are present in an in
vitro cell culture. In some other methods, the HSCs are present in a subject grafted with the
HSCs. In some preferred methods, the subject is human.

In another aspect, the invention provides methods for isolating a population 
of cells that are enriched for hematopoietic stem cells (HSCs). These methods comprise (a) 
obtaining a sample of cells containing hematopoietic stem cells, (b) selecting cells from the 
sample based on expression or lack of expression of at least one known HSC surface marker,
and at least one novel HSC molecule marker identified in the present invention, and (c) 
separating cells with the known HSC marker and at least one of the novel molecule markers; 
thereby isolating a population of human cells enriched for hematopoietic stem cells. 

Preferably, the hematopoietic stem cells enriched with these methods are
human HSCs. In some methods, the known human HSC markers are CD34+ and Thy+. In
some of the methods, the at least one novel HSC marker is a human HSC surface molecule
identified in the present invention.

In another aspect, the invention provides methods for enumerating 
hematopoietic stem cells in a population of cells. The methods entail (a) contacting the
population of cells with an antibody that specifically binds to one novel HSC surface marker 
identified in the present invention under conditions that allow the antibody to specifically 
bind to the HSC surface marker, and (b) quantifying the cells recognized by the antibody; 
thereby enumerating hematopoietic stem cells in the population of cells. In some of these
methods, the hematopoietic stem cells are human HSCs, and the population of cells are first 
selected for expression of CD34 and Thy prior to the contacting. 

A further understanding of the nature and advantages of the present invention
may be realized by reference to the remaining portions of the specification and claims. 

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 shows schematic structure of expression vectors for overexpressing 
various HSC differentiation-inhibiting genes. 

Figure 2 shows that ID3 over-expression increases the number of colony 
forming cells in CFC assay. 

Figure 3 shows upregulated expression of various transcription factors in
mouse HSCs.

DETAILED DESCRIPTION 

I. Overview 

The present invention is predicated in part on the discovery by the present 
inventor that a number of genes are differentially expressed in hematopoietic stem cell 
populations (see Examples below). It was also found that some of these HSC genes slow 
down HSC differentiation or enhance HSC activities when they are overexpressed in HSCs. 
These genes are therefore termed HSC differentiation-inhibiting genes. 

Using HSCs enriched from blood of normal human donors, it was found that 
sequences upregulated in the human HSCs include genes encoding hormones, enzymes, 
histones, transcription factors, secreted proteins, surface markers, and other molecules. Table
1 lists examples of these genes that are upregulated in human HSCs (CD34+Thy+) as
compared to non-stem cells (CD34+Thy-). Further, using HSCs isolated from two different
sources, bone marrow and peripheral blood, the present inventor identified a set of genes that 
are differentially expressed in HSCs from both sources. Some of these genes are shown in 
Table 2. 

Similarly, in a mouse HSC population (CD34-CD38+), a number of genes 
encoding proteins with diverse biochemical and cellular functions were also upregulated, 
including genes encoding surface antigens, transcription factors or growth factors (see 
Tables 3 and 4). These novel HSC genes are enriched in HSCs compared to their 
differentiated progeny (e.g., CD34+ CD38+ progenitor cells) or CD34+CD38- facilitator
cells. 

Without being bound in theory, the molecules upregulated in HSCs could 
play various functions in modulating HSC growth and differentiation, as well as regulating 
activities and functions of progenitor cells that differentiated from the HSCs. For example, 
increased levels of some of the surface receptors, growth factors, and secreted proteins 
shown in Table 2 could act in synergy in inhibiting HSC differentiation and promoting their 
expansion. 

In accordance with these discoveries, the present invention provides methods 
for modulating HSC differentiation. Inhibition of HSC differentiation allows continued 
growth and expansion of the HSC population, and therefore provides engraftable HSCs with
increased dosage and higher potency. A number of the upregulated HSC genes identified
herein (e.g., shown in Tables 1, 3, and 4) can potentially function as HSC differentiation-
inhibitors. For example, polypeptides encoded by the novel HSC genes disclosed herein 
(e.g., the growth factors or hormones shown in Table 2) can be used to inhibit HSC 
differentiation in vitro (e.g., by applying to an HSC cell culture) and in vivo (e.g., by 
administering to a subject engrafted with bone marrow or HSCs). Differentiation inhibiting 
activities of these molecules were exemplified by GATA3 and ID3 as shown in the 
Examples below. 

As indicated by the GenBank accession numbers or other identification 
numbers or descriptions in Tables 1, 3, and 4, sequences of the upregulated human and 
mouse HSC genes disclosed herein are all known in the art. Thus, as detailed below, the 
HSC differentiation-inhibiting polynucleotide sequences can be easily obtained 
commercially, from the sources disclosed in the public databases, or isolated using routine 
techniques of molecular biology. The encoded polypeptides can also be obtained 
commercially or easily produced with standard procedures of recombinant techniques. 

The invention also provides methods for isolating and enriching HSCs. The 
currently known HSC markers are not satisfactory because they cannot accurately predict 
homogeneity and hemopoiesis activities of cells bearing the markers. The discovery of 
genes differentially expressed in HSCs provides novel molecular markers for selecting and 
enriching HSCs. For example, antibodies against novel surface markers disclosed in the 
present invention (e.g., those in Tables 2, 3, 4 and 5) can be used to isolate human and
mouse HSCs from a crude population of cells (e.g., bone marrow or peripheral blood). The 
methods can also be directed to cell populations already enriched for one or more of the 
known HSC markers (e.g., CD34+, Thy+ in human, and CD38+, c-kit+, Sca1+ in mice).
Further enrichment using these novel markers can lead to more homogeneous HSCs with 
more potent hematopoiesis activities. 

In both the autologous and allogeneic setting, the time to recover from BMT 
is directly related to the dose of HSCs transplanted. Even a modest 2 to 3-fold expansion of 
engraftable HSC would afford great benefit to patients by minimizing the duration of 
cytopenia when patients are most susceptible to infection. Thus, isolation and expansion of 
more homogeneous HSCs in vitro in accordance with the present invention would make 
autologous and allogeneic HSC transplantation safer and more effective. 

The practice of the present invention will employ, unless otherwise indicated,
conventional techniques of cell biology, molecular biology, cell culture, immunology and
the like which are in the skill of one in the art. These techniques are fully disclosed in the art,
e.g., in Sambrook et al., "Molecular Cloning: A Laboratory Manual," Cold Spring Harbor
Laboratory Press (3rd ed. 2001); Carter and Sweet, "Methods of Enzymology," Academic
Press (1997); and Harlow and Lane, "Antibodies, A Laboratory Manual," Cold Spring 
Harbor Press (1998). 

The following sections provide more specific guidance for making and using 
the compositions of the invention, and for carrying out the methods of the invention. 

Table 1. Genes upregulated in human CD34+Thy+ HSCs from peripheral blood

Classification             Name         Description
-------------------------  -----------  -----------
Histone                    H2BFL        Homo sapiens H2B histone family, member A
Histone                    H2AFA        Human histone genes
Histone                    H2A/1        Homo sapiens H2A histone family, member L
Histone                    H1F2         Histone 2A-like protein gene
Histone                    H2B/h        Homo sapiens H2B histone family, member H
Histone                    HH2A/C       Human histone H2AFC gene
Histone                    H2AFQ        Homo sapiens H2A histone family, member Q
HLA                        HLA-DPB1     Human MHC class II lymphocyte antigen beta chain
HLA                        HLA-DQB1     Human MHC class II HLA-DR2-Dw12 mRNA DQw1-beta
HLA                        HLA-E        Homo sapiens HLA-E gene
Secreted-complement        PTS          Homo sapiens 6-pyruvoyltetrahydropterin synthase
Secreted-complement        HFL1         Human factor H homologue mRNA complete cds
Secreted-growth factor     MDK          Homo sapiens midkine (neurite growth-promoting factor 2)
Secreted-hormone           OXT          Homo sapiens oxytocin, prepro-(neurophysin I) mRNA
Secreted-hormone           AVP          Homo sapiens arginine vasopressin mRNA
Signaling-GTP              R-Ras        Human R-ras
Signaling-GTP              GCHFR        Homo sapiens GTP cyclohydrolase I feedback regulatory protein
Signaling-GTP              GUCY1A3      Homo sapiens guanylate cyclase 1, soluble, alpha 3
Signaling-Kinase           WAF1         Human DNA sequence from PAC 431A14 WAF1
Signaling-Kinase           ITPKB        Homo sapiens inositol 1,4,5-triphosphate 3-kinase B
Signaling-Kinase           PPKCL        Homo sapiens protein kinase C, eta
Signaling-Kinase           PPKCZ        Homo sapiens protein kinase C, zeta
Signaling-SH3              SKAP55       Homo sapiens src kinase-associated phosphoprotein of 55kDa
Stress                     PTGS2        Homo sapiens prostaglandin-endoperoxide synthase 2
-                          CYP2A13      Human cytochrome P450
Stress                     CYP2D6       Human mRNA for cytochrome P450 db1 variant b
Stress-apoptosis           BCL2A1       Homo sapiens BCL-2-related protein 1
Structural                 CALB1        Homo sapiens calbindin 1
Structural                 Elastin      Human elastin gene
Structural                 KRT18        Human mRNA fragment for cytokeratin 18
Surface-Ig                 [illegible]  Human gene for immunoglobulin mu
Surface-Ig                 VH4          Human IgM heavy chain variable V-D-J region (VH4) gene
Surface-other              APP          Homo sapiens APP complete sequence
Surface-receptor           BDKRB1       Human bradykinin B1 receptor
Surface-receptor           TLR1         Human mRNA for KIAA0012 gene
Surface-receptor           5T4          Homo sapiens 5T4 oncofetal trophoblast glycoprotein
Surface-receptor           EFL-2        Homo sapiens EHK1 receptor tyrosine kinase ligand
Surface-receptor           EVI2A        Homo sapiens ecotropic viral integration site 2A
Surface-receptor           FLT3         Homo sapiens fms-related tyrosine kinase 3
Surface-receptor           TNFSF10      Human tumor necrosis factor (ligand) superfamily, member 10
Surface-receptor           LTB          Human lymphotoxin beta
Surface-receptor           CDW52        Homo sapiens mRNA for CAMPATH-1
Surface-receptor           CLECSF2      Homo sapiens C-type lectin (activation-induced)
Surface-unknown            GliPR        Human glioma pathogenesis-related protein
Transport                  LRP          Homo sapiens lrp mRNA
Transcription-RUNT         AML1         Human AML1 protein
Transcription-PAR-bZIP     TEF          Human hepatic leukemia factor
Transcription-FKH          FKHR         Homo sapiens forkhead protein
Transcription-suppressor   MN1          Homo sapiens chromosome 22ql 12 MDR region
Transcription-bHLH         ID1          Homo sapiens inhibitor of DNA binding 1
Transcription-bHLH         ID3          Homo sapiens HLH 1R21 mRNA for helix-loop-helix protein
Transcription-bHLH         EPAS1        Homo sapiens endothelial PAS domain protein 1
Transcription-bHLH         [illegible]  Homo sapiens inhibitor of DNA binding (number illegible)
Transcription-GATA         GATA3        Homo sapiens GATA-binding protein 3
Transcription-HMG          hTcf-4       Homo sapiens mRNA for hTCF-4
Transcription-HOX          PHOX1        Human homeobox protein
Transcription-HOX          MEIS1        Homo sapiens MEIS protein
Transcription-splicing     RBP-MS       Homo sapiens RNA-binding protein gene with multiple splicing
Transcription-Translation  TCEA2        Homo sapiens transcription elongation factor A
Unknown                    -            IEX-1 = radiation-inducible immediate-early gene
Unknown                    -            Homo sapiens chromosome 17, clone hRPC.906_A_24
Unknown                    -            Homo sapiens chromosome 22q13 BAC clone CIT987SK-384D8
Unknown                    A-362G6.1    Human chromosome 16 BAC clone CIT987SK-A-362G6
Unknown                    LST1         Homo sapiens LST1 mRNA
Unknown                    KIAA0125     Homo sapiens KIAA0125 gene product


Table 2. Genes Upregulated in Human HSCs from both Bone Marrow and Peripheral Blood

Classification        Name      Description
--------------------  --------  -----------
Hormone               AVP       Homo sapiens arginine vasopressin mRNA
Hormone               -         Corticotropin releasing hormone-binding protein
Enzyme                GUCY1A3   Homo sapiens guanylate cyclase 1, soluble, alpha 3
Enzyme                PPKCZ     Homo sapiens protein kinase C, zeta
Enzyme                -         Iduronate 2-sulfatase (Hunter syndrome)
Transcription factor  HLF       Human hepatic leukemia factor
Transcription factor  GATA3     Homo sapiens GATA-binding protein 3
Transcription         Evi1      Homo sapiens ecotropic viral integration site 1
Transcription         PMX1      Paired mesoderm homeo box 1
Transcription         MN1       Meningioma (disrupted in balanced translocation)
Secreted protein      -         Tetranectin (plasminogen-binding protein)
Secreted protein      -         H factor (complement)-like 1
Surface molecule      -         Transient receptor potential channel 1
Surface molecule      DLK1      Delta-like homolog (Drosophila)
Surface molecule      EphA3     Ephrin-A3
Surface molecule      TNFSF10   Human tumor necrosis factor (ligand) superfamily, member 10
Surface molecule      -         Interferon induced transmembrane protein
Surface molecule      -         Ecotropic viral integration site 2A
Surface molecule      -         Sortilin-related receptor, L (DLR class) A rep
Surface molecule      -         Major histocompatibility complex, class I, E
Surface molecule      -         KIAA0125 gene product

II. Definitions

Unless defined otherwise, all technical and scientific terms used herein have 
the same meaning as commonly understood by those of ordinary skill in the art to which this 
invention pertains. The following references provide one of skill with a general definition of 
many of the terms used in this invention: Singleton et al., Dictionary of Microbiology
and Molecular Biology (2d ed. 1994); The Cambridge Dictionary of Science and
Technology (Walker ed., 1988); and Hale & Marham, The Harper Collins Dictionary
of Biology (1991). In addition, the following definitions are provided to assist the reader in
the practice of the invention. 

The term "analog" is used herein to refer to a molecule that structurally 
resembles a reference molecule but which has been modified in a targeted and controlled 
manner, by replacing a specific substituent of the reference molecule with an alternate 
substituent. Compared to the reference molecule, an analog would be expected, by one 
skilled in the art, to exhibit the same, similar, or improved utility. Synthesis and screening 
of analogs, to identify variants of known compounds having improved traits (such as higher 
binding affinity for a target molecule) is an approach that is well known in pharmaceutical 
chemistry. 

As used herein, "contacting" has its normal meaning and refers to combining 
two or more agents (e.g., polypeptides or small molecule compounds) or combining agents 
and cells (e.g., a polypeptide and a cell). Contacting can occur in vitro, e.g., combining two 
or more agents or combining a test agent and a cell or a cell lysate in a test tube or other 
container. Contacting can also occur in a cell or in situ, e.g., contacting two polypeptides in 
a cell by coexpression in the cell of recombinant polynucleotides encoding the two 
polypeptides, or in a cell lysate. 

An "effective amount or dose" is an amount sufficient to effect beneficial or
desired results. An effective amount may be administered in one or more administrations.
Determination of an effective amount is within the capability of those skilled in the art.
Particularly preferred subjects of the invention in general include living mammals such as
humans, mice and rabbits; most preferred are humans. The administration of an HSC
differentiation-inhibiting polypeptide, or a genetically modified cell comprising a
polynucleotide sequence of the invention, may be by conventional means, for example,
injection, oral administration, inhalation and others. Appropriate carriers and diluents may be
included in the administration of the polypeptide or the modified cells. Samples including
the modified cells and progeny thereof may be taken and tested to determine transduction
efficiency.

The term "fragment" when used in connection with an amino acid sequence
means a part of a reference sequence and having at least 10 amino acid residues, preferably
50 amino acid residues, even more preferably 100 amino acid residues and most preferably
200 amino acid residues which are substantially identical to the reference amino acid
sequence. When used in connection with a nucleic acid sequence, the term means a sequence
including part of the reference sequence and comprising as few as at least 30, 50, 75, 80, 100
or more contiguous nucleotides, preferably at least 200, 300, 400, 500, 600, or more
contiguous nucleotides, even more preferably at least 800, 1000, 1500, 2000 or more
contiguous nucleotides that are identical to the reference sequence.

The term "functional equivalent" when referring to a polypeptide means a
protein having a like function and like or improved specific activity, and a similar amino
acid sequence. In some embodiments, a functional equivalent is a variant in which one or
more amino acid residues are substituted with conserved or non-conserved amino acid
residues, or one in which one or more amino acid residues includes a substituent group.
Conservative substitutions are the replacements, one for another, among the aliphatic amino
acids Ala, Val, Leu and Ile; interchange of the hydroxyl residues Ser and Thr; exchange of the
acidic residues Asp and Glu; substitution between amide residues Asn and Gln; exchange of
the basic residues Lys and Arg; and replacements among aromatic residues Phe and Tyr.
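
The conservative substitution groups listed in this definition can be expressed as a simple lookup. The following is a minimal sketch only; the names CONSERVATIVE_GROUPS and is_conservative are hypothetical and are not part of the disclosure.

```python
# Illustrative sketch: the groups mirror the conservative substitutions named
# in the definition above. All names here are hypothetical.
CONSERVATIVE_GROUPS = [
    {"Ala", "Val", "Leu", "Ile"},  # aliphatic residues
    {"Ser", "Thr"},                # hydroxyl residues
    {"Asp", "Glu"},                # acidic residues
    {"Asn", "Gln"},                # amide residues
    {"Lys", "Arg"},                # basic residues
    {"Phe", "Tyr"},                # aromatic residues
]


def is_conservative(original: str, replacement: str) -> bool:
    """Return True if replacing `original` with `replacement` stays within one group."""
    if original == replacement:
        return False  # identical residues are not a substitution
    return any(original in group and replacement in group
               for group in CONSERVATIVE_GROUPS)
```

For instance, is_conservative("Lys", "Arg") evaluates to True, whereas is_conservative("Asp", "Lys") evaluates to False because the two residues fall in different groups.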

A "heterologous sequence" or a "heterologous nucleic acid," as used herein,
is one that originates from a source foreign to the particular host cell, or, if from the same
source, is modified from its original form. Thus, a heterologous gene in a host cell includes
a gene that, although being endogenous to the particular host cell, has been modified.
Modification of the heterologous sequence can occur, e.g., by treating the DNA with a
restriction enzyme to generate a DNA fragment that is capable of being operably linked to
the promoter. Techniques such as site-directed mutagenesis are also useful for modifying a
heterologous nucleic acid.

The term "homologous" when referring to proteins and/or protein sequences
indicates that they are derived, naturally or artificially, from a common ancestral protein or
protein sequence. Similarly, nucleic acids and/or nucleic acid sequences are homologous
when they are derived, naturally or artificially, from a common ancestral nucleic acid or
nucleic acid sequence. Homology is generally inferred from sequence similarity between 
two or more nucleic acids or proteins (or sequences thereof). The precise percentage of 
similarity between sequences that is useful in establishing homology varies with the nucleic 
acid and protein at issue, but as little as 25% sequence similarity is routinely used to 
establish homology. Higher levels of sequence similarity, e.g., 30%, 40%, 50%, 60%, 70%, 
80%, 90%, 95% or 99% or more can also be used to establish homology. Methods for 
determining sequence similarity percentages, e.g., BLASTP and BLASTN using default 
parameters, are well known and described in the art. 

The terms "identical sequence" and "sequence identity" in the context of two 
nucleic acid sequences or amino acid sequences refer to the residues in the two sequences 
which are the same when aligned for maximum correspondence over a specified comparison 
window. A "comparison window", as used herein, refers to a segment of at least about 20 
contiguous positions, usually about 50 to about 200, more usually about 100 to about 150 in 
which a sequence may be compared to a reference sequence of the same number of
contiguous positions after the two sequences are aligned optimally. Methods of alignment of 
sequences for comparison are well-known in the art. Optimal alignment of sequences for 
comparison may be conducted by the local homology algorithm of Smith and Waterman 
(1981) Adv. Appl. Math. 2:482; by the alignment algorithm of Needleman and Wunsch 
(1970) J. Mol. Biol. 48:443; by the search for similarity method of Pearson and Lipman 
(1988) Proc. Natl. Acad. Sci. U.S.A. 85:2444; by computerized implementations of these
algorithms (including, but not limited to, CLUSTAL in the PC/Gene program by
Intelligenetics, Mountain View, CA; and GAP, BESTFIT, BLAST, FASTA, or TFASTA in
the Wisconsin Genetics Software Package, Genetics Computer Group (GCG), 575 Science
Dr., Madison, Wis., U.S.A.). The CLUSTAL program is well described by Higgins and
Sharp (1988) Gene 73:237-244; Higgins and Sharp (1989) CABIOS 5:151-153; Corpet et al.
(1988) Nucleic Acids Res. 16:10881-10890; Huang et al. (1992) Computer Applications in
the Biosciences 8:155-165; and Pearson et al. (1994) Methods in Molecular Biology 24:307-
331. Alignment is also often performed by inspection and manual alignment.
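
By way of illustration only, the global alignment approach of Needleman and Wunsch can be sketched as a dynamic-programming score computation. The scoring values used here (match +1, mismatch -1, gap -1) and the function name needleman_wunsch_score are assumptions chosen for the example; they are not parameters prescribed by the disclosure or by any of the cited programs.

```python
def needleman_wunsch_score(a: str, b: str, match: int = 1,
                           mismatch: int = -1, gap: int = -1) -> int:
    """Return the optimal global alignment score of sequences `a` and `b`.

    Scoring values are illustrative assumptions, with a linear gap penalty.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    # Aligning a prefix against an empty sequence costs one gap per residue.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    # Each cell takes the best of: match/mismatch, gap in b, gap in a.
    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap
            left = score[i][j - 1] + gap
            score[i][j] = max(diagonal, up, left)

    return score[-1][-1]
```

This sketch returns only the score; recovering the aligned sequences themselves would require a traceback through the matrix, which is omitted here for brevity.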

The term "isolated" means that the material is removed from its original 
environment (e.g., the natural environment if it is naturally occurring). For example, a 
naturally-occurring nucleic acid, polypeptide, or cell present in a living animal is not 
isolated, but the same polynucleotide, polypeptide, or cell separated from some or all of the
coexisting materials in the natural system, is isolated, even if subsequently reintroduced into 
the natural system. Such nucleic acids can be part of a vector and/or such nucleic acids or 
polypeptides could be part of a composition, and still be isolated in that such vector or 
composition is not part of its natural environment. When referring to a cell population, it 
means that homogeneous cells expressing a given set of molecular markers constitute at least
60%, preferably 75%, more preferably 90%, and most preferably 95% of the total number of
cells in the population.

The term "substantially identical," as applied to nucleic acid or amino acid sequences,
means that a nucleic acid or amino acid sequence comprises a sequence that has at least 90% 
sequence identity or more, preferably at least 95%, more preferably at least 98% and most 
preferably at least 99%, compared to a reference sequence using the programs described 
above (preferably BLAST) using standard parameters. For example, the BLASTN program 
(for nucleotide sequences) uses as defaults a wordlength (W) of 11, an expectation (E) of 10,
M=5, N=-4, and a comparison of both strands. For amino acid sequences, the BLASTP
program uses as defaults a wordlength (W) of 3, an expectation (E) of 10, and the
BLOSUM62 scoring matrix (see Henikoff & Henikoff, Proc. Natl. Acad. Sci. USA 89:10915
(1989)). Percentage of sequence identity is determined by comparing two optimally aligned
sequences over a comparison window, wherein the portion of the polynucleotide sequence in 
the comparison window may comprise additions or deletions (i.e., gaps) as compared to the 
reference sequence (which does not comprise additions or deletions) for optimal alignment 
of the two sequences. The percentage is calculated by determining the number of positions 
at which the identical nucleic acid base or amino acid residue occurs in both sequences to 
yield the number of matched positions, dividing the number of matched positions by the total 
number of positions in the window of comparison and multiplying the result by 100 to yield 
the percentage of sequence identity. Preferably, the substantial identity exists over a region 
of the sequences that is at least about 50 residues in length, more preferably over a region of 
at least about 100 residues, and most preferably the sequences are substantially identical over 
at least about 150 residues. In a most preferred embodiment, the sequences are substantially 
identical over the entire length of the coding regions. 
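
The percentage calculation described in this definition can be illustrated with a short sketch. It assumes the two sequences have already been optimally aligned over the comparison window, with gaps written as "-"; the function names and the 90% default threshold argument are hypothetical conveniences that follow the lowest threshold stated above.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Matched positions divided by total window positions, multiplied by 100.

    Both inputs are assumed to be pre-aligned and of equal length, with
    gaps represented as "-". A gap never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("comparison window must not be empty")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


def is_substantially_identical(aligned_a: str, aligned_b: str,
                               threshold: float = 90.0) -> bool:
    """Apply an identity threshold (assumed default: at least 90%)."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

As a usage example, the aligned pair "ACGT-A" and "ACGTTA" has five matched positions in a window of six, which gives roughly 83.3% identity and therefore falls below the 90% threshold.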

The terms "nucleic acid" and "polynucleotide" refer to a deoxyribonucleotide 
or ribonucleotide polymer in either single- or double-stranded form, and unless otherwise 
limited, encompasses known analogues of natural nucleotides that hybridize to nucleic acids
in a manner similar to naturally occurring nucleotides. A "polynucleotide sequence" is a
nucleic acid (which is a polymer of nucleotides (A, C, T, U, G, etc., or naturally occurring or
artificial nucleotide analogues)) or a character string representing a nucleic acid, depending
on context. Either the given nucleic acid or the complementary nucleic acid can be 
determined from any specified polynucleotide sequence. 
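
As an illustration of how the complementary nucleic acid can be determined from a character string, the following minimal sketch computes the reverse complement of a DNA sequence. It assumes an upper-case string limited to the four standard bases A, C, G and T; the function name reverse_complement is hypothetical, and nucleotide analogues are outside the scope of the sketch.

```python
# Translation table pairing each standard DNA base with its complement.
_DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence: str) -> str:
    """Return the complementary strand, written 5' to 3'.

    Assumes an upper-case DNA string containing only A, C, G and T.
    """
    unexpected = set(sequence) - set("ACGT")
    if unexpected:
        raise ValueError(f"unsupported characters: {sorted(unexpected)}")
    # Complement each base, then reverse to restore 5'-to-3' orientation.
    return sequence.translate(_DNA_COMPLEMENT)[::-1]
```

Applying the function twice returns the original sequence, which is a convenient sanity check when handling sequence strings.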

The term "operably linked" refers to a functional relationship between two or 
more polynucleotide (e.g., DNA) segments. Typically, it refers to the functional relationship 
of a transcriptional regulatory sequence to a transcribed sequence. For example, a promoter 
or enhancer sequence is operably linked to a coding sequence if it stimulates or modulates 
the transcription of the coding sequence in an appropriate host cell or other expression 
system. Generally, promoter transcriptional regulatory sequences that are operably linked to 
a transcribed sequence are physically contiguous to the transcribed sequence, i.e., they are 
cis-acting. However, some transcriptional regulatory sequences, such as enhancers, need not 
be physically contiguous or located in close proximity to the coding sequences whose 
transcription they enhance. A polylinker provides a convenient location for inserting coding 
sequences so the genes are operably linked to the promoter. Polylinkers are polynucleotide 
sequences that comprise a series of three or more closely spaced restriction endonuclease 
recognition sequences. 

As used herein the term "overexpression" refers to expression of a 
polypeptide brought about by genetic modification of a host cell with a nucleic acid 
sequence encoding the polypeptide. Overexpression may take place in cells normally 
lacking expression of the polypeptide (e.g., an HSC differentiation-inhibiting polypeptide). 
It can also occur in cells with endogenous expression of the polypeptide. While 
overexpression may take place in any cell type, preferred host cells for overexpressing an 
HSC differentiation-inhibiting polypeptide are hematopoietic stem cells. 

The terms "polypeptide" and "protein" are used interchangeably herein, and 
refer to a polymer of amino acid residues, e.g., as typically found in proteins in nature. A 
"mature protein" is a protein which is full-length and which, optionally, includes 
glycosylation or other modifications typical for the protein in a given cell membrane. 

A "variant" of a molecule such as an HSC differentiation-inhibiting 
polypeptide is meant to refer to a molecule substantially similar in structure and biological 
activity to either the entire molecule, or to a fragment thereof. Thus, provided that two 
molecules possess a similar activity, they are considered variants as that term is used herein 
even if the composition or secondary, tertiary, or quaternary structure of one of the 
molecules is not identical to that found in the other, or if the sequence of amino acid residues 
is not identical. In some embodiments, a variant differs in amino acid sequence from a 
reference polypeptide by one or more substitutions, additions, deletions, or truncations, which 
may be present in any combination. Among preferred variants are those that vary from a 
reference polypeptide by conservative amino acid substitutions. Such substitutions are those 
that substitute a given amino acid by another amino acid of like characteristics. The following 
non-limiting groups of amino acids are considered conservative replacements: a) alanine, serine, 
and threonine; b) glutamic acid and aspartic acid; c) asparagine and glutamine; d) arginine 
and lysine; e) isoleucine, leucine, methionine and valine; and f) phenylalanine, tyrosine and 
tryptophan. Most highly preferred are variants that retain the same biological function and 
activity as the reference polypeptide from which they vary. 
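
The conservative replacement groups listed above can be encoded directly. The following Python sketch, offered only as an illustration, flags each position at which a variant differs from a reference polypeptide and reports whether the change falls within one of groups a) through f). It assumes one-letter amino acid codes and two sequences of equal length (substitutions only); the function names are hypothetical.

```python
# Groups a) through f) from the text, in one-letter amino acid codes.
CONSERVATIVE_GROUPS = [
    set("AST"),   # a) alanine, serine, threonine
    set("ED"),    # b) glutamic acid, aspartic acid
    set("NQ"),    # c) asparagine, glutamine
    set("RK"),    # d) arginine, lysine
    set("ILMV"),  # e) isoleucine, leucine, methionine, valine
    set("FYW"),   # f) phenylalanine, tyrosine, tryptophan
]


def is_conservative(a: str, b: str) -> bool:
    """True if replacing residue a with residue b stays within one group."""
    a, b = a.upper(), b.upper()
    if a == b:
        return False  # identical residues are not a substitution
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


def classify_substitutions(reference: str, variant: str):
    """List (position, ref_residue, var_residue, conservative?) per change.

    Positions are 1-based. Only equal-length sequences are handled here.
    """
    if len(reference) != len(variant):
        raise ValueError("sketch handles substitutions only (equal lengths)")
    return [
        (pos, r, v, is_conservative(r, v))
        for pos, (r, v) in enumerate(zip(reference, variant), start=1)
        if r != v
    ]


changes = classify_substitutions("MKAL", "MRAD")
```

Here the K-to-R change at position 2 is conservative (group d), while the L-to-D change at position 4 is not.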

III. Promoting HSC Expansion by Inhibiting Differentiation 

In addition to novel markers and methods for isolating HSCs, the invention 
also provides methods for inhibiting or blocking differentiation of mammalian hematopoietic 
stem cells, thereby promoting expansion of the stem cells. A number of the novel HSC 
marker genes identified in the present invention can inhibit or block HSC differentiation. 
Examples of such differentiation-inhibiting genes are shown in Tables 1 and 2 (for human 
HSC) and Tables 3 and 4 (for mouse HSC). For example, as described in the Examples 
below, overexpression of GATA-binding protein 3 in human stem cells slows differentiation of 
the cells. HSCs overexpressing ID3 produced increased numbers of colony forming cells, indicating 
enhanced HSC activity as compared to a control. These differentiation-inhibiting molecules can be 
used in the present invention to inhibit HSC differentiation and thereby promote expansion 
in vitro. They can also be used in vivo to increase the effective dose of engrafted HSCs in a 
subject. 

The term HSC differentiation-inhibiting molecules (polynucleotides and the 
encoded polypeptides) includes the molecules shown in Tables 1-4 that inhibit or slow HSC 
differentiation. Polynucleotides with substantial sequence identity are also encompassed. In 
addition, the term includes variants, analogs, fragments, or functional derivatives of the HSC 
differentiation-inhibiting molecules shown in Tables 1-4. These differentiation-inhibiting 
molecules can be obtained from any species. Preferably, they are from mammalian species 
including human, mouse, and chicken. The HSC differentiation-inhibiting molecules can 
also be from any source whether natural, synthetic or recombinant. 

Differentiation is defined as the restriction of the potential of a cell to self- 
renew and is normally associated with a change in the functional capacity of the cell. The 
term "inhibiting" or "blocking" differentiation is used broadly in the context of this invention 
and includes not only the prevention of differentiation but also encompasses altering or 
slowing the differentiation process of a cell. Differentiation of a stem cell can be determined by 
methods well known in the art and these include analysis for surface markers associated with 
cells of a defined differentiated state. 

An HSC differentiation-inhibiting polynucleotide of the present invention 
encodes an HSC differentiation-inhibiting polypeptide that blocks or slows down 
differentiation of HSCs (e.g., as listed in Tables 1-4). As shown in the Tables, these 
molecules include hormones, secreted proteins, or growth factors. These molecules also 
include transcription factors. One or more of these HSC differentiation-inhibiting 
polypeptides, or fragments thereof, can be applied to HSC cells in vitro, e.g., in a cell 
culture. These cells can be cultured and grown as described herein or other methods well 
known in the art. The appropriate amount of these differentiation-inhibiting polypeptides to 
be used in the cultures can be easily determined in accordance with stem cell culturing 
procedures described herein or knowledge well known in the art. By culturing the HSC in 
the presence of these molecules, differentiation of the cells can be inhibited or slowed, 
resulting in enhanced growth of engraftable HSCs. 

In addition to promoting HSC expansion in vitro, the HSC differentiation- 
inhibiting polypeptides of the invention can also be administered directly to a subject to 
promote in vivo growth of HSCs. For example, a subject engrafted with bone marrow or a 
population of HSCs can also be administered an effective amount of an HSC differentiation- 
inhibiting polypeptide or fragment thereof (e.g., the secreted proteins or growth factors 
shown in Table 1 and Tables 3-4). The polypeptide can be administered to the subject prior 
to, concurrently with, or subsequent to transplantation of the bone marrow or HSCs. 
Preferably, the polypeptide and the HSCs are administered to the subject simultaneously. 

Other than using a differentiation-inhibiting polypeptide, inhibition of HSC 
differentiation can also be achieved using an HSC differentiation-inhibiting polynucleotide 
to genetically modify HSCs. HSC differentiation-inhibiting polynucleotides suitable for 
these methods include some of the genes upregulated in HSCs (as shown in Tables 1 and 3). 
They encode HSC differentiation-inhibiting polypeptides that block or slow down 
differentiation of the HSC cells. Some of these methods first require isolation of a 
population of hematopoietic cells, e.g., a population of CD34+ Thy+ human cells or CD34- 
CD38+ mouse cells as described above, from a source of such cells. An HSC differentiation- 
inhibiting polynucleotide of the invention can then be introduced into the cells whereby the 
cells are genetically modified. 

Once the cells are genetically modified, they are cultured in the presence of at 
least one cytokine in an amount sufficient to support growth of the modified cells. The 
modified cells are then selected wherein the encoded polypeptide is overexpressed and 
differentiation is blocked. The genetically modified cells thus obtained may be used 
immediately (e.g., in transplant), cultured and expanded in vitro, or stored for later uses. The 
modified HSCs may be stored by methods well known in the art, e.g., frozen in liquid 
nitrogen. 

Genetic modification as used herein encompasses any genetic modification 
method of introduction of an exogenous or foreign gene into mammalian cells (particularly 
human stem cell and hematopoietic cells). The term includes but is not limited to 
transduction (viral mediated transfer of host DNA from a host or donor to a recipient, either 
in vitro or in vivo), transfection (transformation of cells with isolated viral DNA genomes), 
liposome mediated transfer, electroporation, calcium phosphate transfection or 
coprecipitation and others. Methods of transduction include direct co-culture of cells with 
producer cells (Bregni et al., Blood 80:1418-1422, 1992) or culturing with viral supernatant 
alone with or without appropriate growth factors and polycations (Xu et al., Exp. Hemat. 
22:223-230, 1994). 

Various in vitro and in vivo assays are well known in the art for the 
measurement of the functional compositions of hematopoietic cell populations. See, e.g., 
Quesenberry et al. eds., Stem Cell Biology and Gene Therapy, Wiley-Liss Inc. 1998, 
Chapter 5, Hematopoietic Stem Cells: Proliferation, Purification and Clinical Applications, 
pgs 133-160. Other examples of suitable assays are also known in the art. For example, the 
long term culture-initiating cell (LTCIC) assay involves culturing a cell population on 
stromal cell monolayers for approximately 5 weeks and then testing in a 2 week semisolid 
media culture for the frequency of clonogenic cells retained (Sutherland et al., Blood 
74:1563 (1989)). The Colony Forming Cells (CFC) assay or Colony-Forming Unit Culture 
(CFUC) assay involves use of cell count as the number of colony-forming units per unit 
volume or area of a sample. The assay is used to measure clonal growth of quickly maturing 
progenitors in semi-solid media supplemented with serum and growth factors. Depending 
on the growth factors used to stimulate growth, mature and/or primitive progenitors may be 
determined. Cobblestone area forming colony (CAFC) assays measure clonal proliferation 
of long-lived progenitors supported by stromal cell monolayers and growth factor/serum 
supplemented media. On the appropriate stromal monolayers, cells pluripotent for myeloid 
and lymphoid lineages may be determined (Young et al., Blood 88:1619, 1996). SCID-hu 
bone assays measure the proliferation and multilineage differentiation of cells with bone 
marrow repopulating activity. These cells are likely to contribute to durable engraftment in 
clinical transplantation. SCID-hu thymus assays measure the proliferation and differentiation 
in thymocytes. Both bone marrow repopulating and more mature T-lineage progenitors may 
be measured. 

A polynucleotide encoding an HSC differentiation-inhibiting molecule is 
typically introduced to a host cell in a vector. The vector typically includes the necessary 
elements for the transcription and translation of the inserted coding sequence. Methods used 
to construct such vectors are well known in the art. For example, techniques for constructing 
suitable expression vectors are described in detail in Sambrook et al., Molecular Cloning: A 
Laboratory Manual, Cold Spring Harbor Press, N.Y. (3rd Ed., 2000); and Ausubel et al., 
Current Protocols in Molecular Biology, John Wiley & Sons, Inc., New York (1999). 

Vectors may include but are not limited to viral vectors, such as baculovirus, 
retroviruses, adenoviruses, adeno-associated viruses, and herpes simplex viruses; 
bacteriophages; cosmids; plasmid vectors; synthetic vectors; and other recombination 
vehicles typically used in the art. Vectors containing both a promoter and a cloning site into 
which a polynucleotide can be operatively linked are well known in the art. Such vectors are 
capable of transcribing RNA in vitro or in vivo, and are commercially available from sources 
such as Stratagene (La Jolla, Calif.) and Promega Biotech (Madison, Wis.). Specific 
examples include pSG, pSV2CAT, and pXT1 from Stratagene; and pMSG, pSVL, pBPV and 
pSVK3 from Pharmacia. 

Preferred vectors include retroviral vectors (see, Coffin et al., "Retroviruses", 
Chapter 9, pp. 437-473, Cold Spring Harbor Laboratory Press, 1997). Vectors useful in the 
invention can be produced recombinantly by procedures well known in the art. For example, 
WO 94/29438, WO 97/21824 and WO 97/21825 describe the construction of retroviral 
packaging plasmids and packaging cell lines. Exemplary vectors include the pCMV 
mammalian expression vectors, such as pCMV6b and pCMV6c (Chiron Corp.), pSFFV- 
Neo, and pBluescript-SK+. Non-limiting examples of useful retroviral vectors are those 
derived from murine, avian or primate retroviruses. Common retroviral vectors include 
those based on the Moloney murine leukemia virus (MoMLV-vector). Other MoMLV 
derived vectors include LMiLy, LINGFER, MINGFR and MINT (Chang et al., Blood 92:1- 
11, 1998). Additional vectors include those based on Gibbon ape leukemia virus (GALV), 
Moloney murine sarcoma virus (MoMSV) and spleen focus forming virus (SFFV). 
Vectors derived from the murine stem cell virus (MESV) include MESV-MiLy (Agarwal et 
al., J. of Virology, 72:3720-3728, 1998). Retroviral vectors also include vectors based on 
lentiviruses, and non-limiting examples include vectors based on human immunodeficiency 
virus (HIV-1 and HIV-2). 

In producing retroviral vector constructs, the viral gag, pol and env sequences 
can be removed from the virus, creating room for insertion of foreign DNA sequences. 
Genes encoded by foreign DNA are usually expressed under the control of a strong viral 
promoter in the long terminal repeat (LTR). Selection of appropriate control regulatory 
sequences is dependent on the host cell used and selection is within the skill of one in the art. 
Numerous promoters are known in addition to the promoter of the LTR. Non-limiting 
examples include the phage lambda PL promoter; the human cytomegalovirus (CMV) 
immediate early promoter; the U3 region promoter of the Moloney Murine Sarcoma Virus 
(MMSV), Rous Sarcoma Virus (RSV), or Spleen Focus Forming Virus (SFFV); Granzyme 
A promoter; Granzyme B promoter; CD34 promoter; and the CD8 promoter. Additionally, 
inducible or multiple control elements may be used. 

Such a construct can be packaged into viral particles efficiently if the gag, pol 
and env functions are provided in trans by a packaging cell line. Therefore, when the vector 
construct is introduced into the packaging cell, the gag-pol and env proteins produced by the 
cell assemble with the vector RNA to produce infectious virions that are secreted into the 
culture medium. The virus thus produced can infect and integrate into the DNA of the target 
cell, but does not produce infectious viral particles since it is lacking essential packaging 
sequences. Most of the packaging cell lines currently in use have been transfected with 
separate plasmids, each containing one of the necessary coding sequences, so that multiple 
recombination events are necessary before a replication competent virus can be produced. 
Alternatively the packaging cell line harbors a provirus. The provirus has been crippled so 
that although it may produce all the proteins required to assemble infectious viruses, its own 
RNA cannot be packaged into virus. RNA produced from the recombinant virus is packaged 
instead. Therefore, the virus stock released from the packaging cells contains only 
recombinant virus. Non-limiting examples of retroviral packaging lines include PA12, 
PA317, PE501, PG13, PSI.CRIP, RD114, GP7C-tTA-G10, ProPak-A (PPA-6), and PT67. 
Reference is made to Miller et al., Mol. Cell Biol. 6:2895, 1986; Miller et al., Biotechniques 
7:980, 1989; Danos et al., Proc. Natl. Acad. Sci. USA 85:6460, 1988; Pear et al., Proc. Natl. 
Acad. Sci. USA 90:8392-8396, 1993; and Finer et al., Blood 83:43-50, 1994. 

Other suitable vectors include adenoviral vectors (see, Frey et al., Blood 
91:2781, 1998; and WO 95/27071) and adeno-associated viral vectors. These vectors are all 
well known in the art, e.g., as described in Chatterjee et al., Current Topics in Microbiol. and 
Immunol., 218:61-73, 1996; Stem Cell Biology and Gene Therapy, eds. Quesenberry et al., 
John Wiley & Sons, 1998; and U.S. Pat. Nos. 5,693,531 and 5,691,176. The use of 
adenovirus-derived vectors may be advantageous under certain situations because they are 
capable of infecting non-dividing cells. Unlike retroviral DNA, the adenoviral DNA is not 
integrated into the genome of the target cell. Further, the capacity to carry foreign DNA is 
much larger in adenoviral vectors than retroviral vectors. The adeno-associated viral vectors 
are another useful delivery system. The DNA of this virus may be integrated into non- 
dividing cells, and a number of polynucleotides have been successfully introduced into 
different cell types using adeno-associated viral vectors. 

In some embodiments, the construct or vector will include two or more 
heterologous polynucleotide sequences: a) the nucleic acid sequence encoding an HSC 
differentiation-inhibiting polypeptide of the invention, and b) one or more additional nucleic 
acid sequences. Preferably the additional nucleic acid sequence is a polynucleotide which 
encodes a selective marker, a structural gene, a therapeutic gene, a ribozyme, or an antisense 
sequence. 

A selective marker may be included in the construct or vector for the 
purposes of monitoring successful genetic modification and for selection of cells into which 
DNA has been integrated. Non-limiting examples include drug resistance markers, such as 
G418 or hygromycin. Additionally, negative selection may be used, for example wherein the 
marker is the HSV-tk gene. This gene will make the cells sensitive to agents such as 
acyclovir and gancyclovir. Selection may also be made by using a cell surface marker, for 
example, to select overexpression of an HSC differentiation-inhibiting polypeptide by 
fluorescence activated cell sorting (FACS). The NeoR (neomycin/G418 resistance) gene is 
commonly used, but any convenient marker gene whose gene sequences are not 
already present in the target cell can be used. Further non-limiting examples include low- 
affinity Nerve Growth Factor Receptor (NGFR), enhanced green fluorescent protein (EGFP), 
dihydrofolate reductase gene (DHFR), the bacterial hisD gene, murine CD24 (HSA), murine 
CD8a (lyt), bacterial genes which confer resistance to puromycin or phleomycin, and beta- 
galactosidase. 

The additional polynucleotide sequence(s) may be introduced into the host 
cell on the same vector as the polynucleotide sequence encoding the polypeptides of the 
invention or the additional polynucleotide sequence may be introduced into the host cells on 
a second vector. In a preferred embodiment, a selective marker will be included on the same 
vector as the HSC differentiation-inhibiting polynucleotide. 

Typically, the host cells for expressing the HSC differentiation-inhibiting 
polynucleotide are mammalian stem cells, e.g., HSCs from humans, mice, monkeys, farm 
animals, sport animals, pets, and other laboratory rodents and animals. These cells can be 
obtained, cultured, and manipulated as described above and in Potten C. S. ed., Stem Cells, 
Academic Press, 1997; Stem Cell Biology and Gene Therapy, eds. Quesenberry et al., John 
Wiley & Sons Inc., 1998; and Gage et al., Ann. Rev. Neurosci. 18:159-192, 1995. 

IV. Molecular Markers for Isolating and Enriching HSCs 

As detailed in the Examples below, the present inventor identified a number 
of genes that are differentially expressed in human and mouse HSCs. These genes, which 
can play a role in regulating hemopoiesis as well as activities of HSCs and progenitor cells, 
are suitable as markers for selecting and enriching HSCs from diverse populations of cells. 
As exemplified in Tables 1-4, these HSC markers include transmembrane proteins (e.g., 
receptors), growth factors, transcription factors, as well as other proteins with diverse cellular 
and biochemical functions. 

Employing these novel HSC markers, the present invention provides methods 
for isolating stem cells from any vertebrate, particularly mammalian, species. In general, 
one or more of the novel markers can be targeted in the methods. Selection with these 
markers can be performed alone with a crude population of cells (e.g., bone marrow). The 
selection scheme can also be used in combination with other selection and purification 
procedures, e.g., to further select HSCs from cells already enriched for other known HSC 
surface markers. 

In some embodiments, the novel markers for selecting and enriching HSCs 
are cell surface markers. As described in the Examples, a number of the genes upregulated 
in the human and mouse HSCs encode transmembrane proteins (see also Tables 2 and 7). 
These proteins provide novel surface markers for isolating HSCs from or enumerating HSCs 
in a population of diverse cells (e.g., bone marrow). These methods are useful for isolating 
stem cells from primates (e.g., humans, monkeys, gorillas) and domestic animals (e.g., bovine, 
equine, ovine, porcine), etc. Isolation of HSCs bearing these novel markers can be performed 
with the same procedures disclosed herein for the other phenotypic markers. 

In some embodiments, selection of the novel HSC markers utilizes antibodies 
that recognize the novel HSC markers. This includes preparing an antibody to a novel HSC 
marker (e.g., a surface marker) of the invention and purifying the antibody. By exposing a 
population of hematopoietic cells or crude cells to the antibody and allowing the exposed 
cells to bind with the antibody, cells bearing the novel HSC marker can be isolated. 
Techniques including antibody preparation and purification are well known and routinely 
practiced in the art. See, e.g., Harlow and Lane, Antibodies: A Laboratory Manual, Cold 
Spring Harbor Press (1998). Such antibodies encompass any antibody or fragment thereof 
either native or recombinant, synthetic or naturally derived, which retains sufficient 
specificity to bind specifically to an HSC marker. They may be monoclonal or polyclonal, 
and can be produced using the novel HSC marker protein or a fragment or variant thereof. 
In addition, antibodies that recognize some of these marker proteins may also be obtained 
commercially. 

When combined with other selection procedures, the particular order by 
which hematopoietic cells are separated from other cells is not critical to this invention. 
When a genetically modified HSC cell is to be selected (as detailed above), the specific cell 
types may be separated either prior to genetic modification or after genetic modification. In 
some methods, crude cell samples are initially separated by markers indicating unwanted 
cells (a negative selection), followed by separations for markers or marker levels 
[illegible in source] 
novel markers of the present invention. In some other methods, following the initial crude 
separation, the cells can be directly subject to enrichment for at least one of the novel HSC 
markers. 

For example, an initial crude cell population can be first purified to remove 
major cell families from the bone marrow or other hematopoietic cell source. A negative 
selection can then be carried out by targeting some of the cell surface antigens (e.g., Lin, 
CD34 for mouse HSCs). A further positive selection can be performed to isolate a cell 
population with specific stem cell markers (e.g., CD34 and Thy for human HSC, and c-kit, 
Sca-1, or CD38 for mouse HSC). Thereafter, additional selections can be carried out using 
one or more of the novel HSC surface markers disclosed herein. 
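
The staged selection in the example above (negative selection against lineage antigens, positive selection for known stem cell markers, then selection for a novel marker) can be sketched in Python as a sequence of filters. This is only an illustration: each cell is modeled as a dictionary of marker name to a positive/negative call, and the marker name "NOVEL_MARKER_1" is a hypothetical placeholder, not a marker disclosed in the Tables.

```python
def select_hsc_candidates(cells, lineage_markers, positive_markers, novel_markers):
    """Apply negative, positive, and novel-marker selection in sequence.

    cells: list of dicts mapping marker name -> True (expressed) / False.
    A marker missing from a cell's dict is treated as not expressed.
    """
    # Negative selection: drop any cell expressing a lineage-specific marker.
    lin_negative = [
        c for c in cells if not any(c.get(m, False) for m in lineage_markers)
    ]
    # Positive selection: keep cells expressing all known stem cell markers.
    known_positive = [
        c for c in lin_negative if all(c.get(m, False) for m in positive_markers)
    ]
    # Final selection: keep cells expressing at least one novel marker.
    return [
        c for c in known_positive if any(c.get(m, False) for m in novel_markers)
    ]


cells = [
    {"id": 1, "CD3": True, "CD34": True, "Thy": True, "NOVEL_MARKER_1": True},
    {"id": 2, "CD34": True, "Thy": True, "NOVEL_MARKER_1": True},
    {"id": 3, "CD34": True, "Thy": False, "NOVEL_MARKER_1": True},
    {"id": 4, "CD34": True, "Thy": True},
]
selected = select_hsc_candidates(
    cells,
    lineage_markers=["CD2", "CD3", "CD14", "CD56"],
    positive_markers=["CD34", "Thy"],
    novel_markers=["NOVEL_MARKER_1"],
)
selected_ids = [c["id"] for c in selected]
```

Only cell 2 passes all three stages: cell 1 is removed as lineage-positive, cell 3 lacks Thy, and cell 4 lacks the hypothetical novel marker.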

The starting cell populations for selecting and enriching HSC can be obtained 
from bone marrow or other hematopoietic source. Stem cells and progenitor cells from bone 
marrow constitute only a small percentage (e.g., about 0.01 to about 0.1%) of the bone 
marrow cells. Bone marrow cells may be obtained from a source of bone marrow, e.g. 
tibiae, femora, spine, fetal liver, and other bone cavities. Other sources of hematopoietic 
cells include adult peripheral blood and umbilical cord blood (To et al., Blood 89:2233-2258, 1997). 

Procedures for isolation of bone marrow are well known in the art. For 
example, an appropriate solution may be used to flush the bone. For example, the solution 
can be a balanced salt solution conveniently supplemented with fetal calf serum or other 
naturally occurring factors. These components can be present in conjunction with an 
acceptable buffer at low concentration, generally from about 5 to 25 mM. Convenient 
buffers include but are not limited to HEPES, phosphate and lactate buffers. Bone marrow 
can also be aspirated from the bone in accordance with other conventional techniques well 
known in the art. 

As indicated above, to isolate the HSC cells, a relatively crude separation can 
be initially used to remove major cell families from the bone marrow or other hematopoietic 
cell source. Various techniques may be employed to separate the cells to initially remove 
cells of dedicated lineage. These include physical separation, magnetic separation using 
antibody-coated magnetic beads, affinity chromatography, and cytotoxic agents joined to a 
monoclonal antibody or used in conjunction with a monoclonal antibody. Also included is 
the use of fluorescence activated cell sorters (FACS) wherein the cells can be separated on 
the basis of the level of staining of the particular antigens. These techniques are well known 
to those of ordinary skill in the art and are described in various references including U.S. Pat. 
Nos. 5,061,620; 5,409,813; 5,677,136; and 5,750,397; and Yau et al., Exp. Hematol. 
18:219-222, 1990. 

Monoclonal antibodies are particularly useful for this initial separation 
procedure. The antibodies may be attached to a solid support to allow for separation. In 
some methods, magnetic bead separations are used to attach the antibodies. Conjugating the 
antibodies with markers such as magnetic beads, e.g., using biotin-avidin link, allows for 
direct separation of bound cells from the unbound cells. Antibodies (e.g., monoclonal 
antibodies) directed to the various surface markers of these differentiated cells can be 
obtained commercially or prepared using methods routinely practiced in the art. 

To select HSCs, this initial separation allows removal of large numbers of 
cells of the hematopoietic system of various lineages, such as thymocytes, T-cells, pre-B 
cells, B-cells, granulocytes, myelomonocytic cells, and platelets. Cells that can be separated 
in this stage also include other minor cell populations, e.g., megakaryocytes, mast cells, 
eosinophils and basophils. Generally, at least about 70%, usually 80% or more of the total 
hematopoietic cells will be removed. Since there will be positive selection at the later 
selection steps, it is not essential to remove at the initial stage every dedicated cell class, 
such as the minor population members, the platelets, and erythrocytes. However, it is 
preferable that there be negative selection for all of the dedicated cell lineages, so that in the 
final positive selection the number of dedicated cells present is minimized. 

Phenotypes of surface antigen of the dedicated lineage cells are known in the 
art. For example, CD34 is expressed on most immature T-cells also called thymocytes, and 
these cells lack cell surface expression of CD1, CD2, CD3, CD4, and CD8 antigens. 
CD45RA is a useful T-cell marker. The best known T-cell marker is the T-cell receptor 
(TCR). There are presently two defined types of TCRs, TCR-2 (consisting of α and β 
polypeptides) and TCR-1 (consisting of δ and γ polypeptides). B cells may be selected, for 
example, by expression of CD19 and CD20. Myeloid cells may be selected, for example, by 
expression of CD14, CD15, and CD16. NK cells may be selected based on expression of 
CD56 and CD16. Erythrocytes may be identified by expression of glycophorin A. 
Compositions enriched for progenitor cells capable of differentiation into myeloid cells, 
dendritic cells, or lymphoid cells also include the phenotypes CD45RA+ CD34+ Thy1 and 
CD45RA+ CD10+ Lin- CD34+. Other useful markers for various cell types are also known in 
the art. 

The separation techniques employed should maximize the retention of 
viability of the fraction to be collected. For the initial separations, various techniques of 
differing efficacy may be employed. The particular technique employed will depend upon 
efficiency of separation, cytotoxicity of the methodology, ease and speed of performance, 
and necessity for sophisticated equipment and/or technical skill. Procedures for separation 
may include magnetic separation, using antibody-coated magnetic beads, affinity 
chromatography, cytotoxic agents joined to a monoclonal antibody or used in conjunction 
with a monoclonal antibody, e.g. complement and cytotoxins, and "panning" with antibody 
attached to a solid matrix, e.g. plate. Techniques providing accurate separation include 
fluorescence activated cell sorters, which can have varying degrees of sophistication, e.g. a 
plurality of color channels, low angle and obtuse light scattering detecting channels, and 
impedance channels. 

Following the initial coarse selection, positive and/or negative selection using 
various other known stem cell markers as well as the novel HSC markers disclosed herein 
can be performed. In some methods, human HSCs are isolated using markers such as CD34+ 
and Thy+ as discussed in the Examples below. In some methods, human HSCs are selected 
for a phenotype of CD34+ Thy1+ Lin-. Other examples of enriched phenotypes include 
CD2-, CD3-, CD4-, CD8-, CD10-, CD14-, CD15-, CD19-, CD20-, CD33-, CD34+, CD38-, 
CD45RA-, CD59+/-, CD71-, CDW109+, glycophorin-, AC133+, HLA-DR+/-, c-kit+, and EM+. 
Lin- refers to a cell population selected on the basis of lack of expression of at least one 
lineage specific marker, for example CD2, CD3, CD14, and CD56. The combination of 
expression markers used to isolate and define an enriched HSC population may vary 
depending on various factors and may vary as other expression markers become available. 

Similarly, mouse HSCs can be selected for one or more of the known markers 
such as Lin-, c-kit+, Sca-1+, CD38+, and CD34- (see Example 3). In other methods, murine 
HSCs with similar properties to the human CD34+ Thy-1+ Lin- may be identified by kit+ 
Thy-1.1lo Lin-/lo Sca-1+ (KTLS). Other phenotypes are well known, e.g., as described in US 
Patent No. 6,451,558. When CD34 expression is combined with selection for Thy-1, a 
composition comprising approximately fewer than 5% lineage committed cells can be 
isolated (U.S. Pat. No. 5,061,620). 

Once the cells are harvested and optionally separated, the cells are cultured in 
a suitable medium comprising a combination of growth factors that are sufficient to maintain 
growth. The term culturing refers to the propagation of cells on or in media of various kinds. 
It is understood that the descendants of a cell grown in culture may not be completely 
identical (either morphologically, genetically or phenotypically) to the parent cell. Methods 
for culturing stem cells and hematopoietic cells are well known to those skilled in the art. 
Any suitable culture container may be used, and these are readily available from commercial 
vendors. The seeding level is not critical, and it will depend on the type of cells used. In 
general, the seeding level will be at least 10 cells per ml, more usually at least about 100 
cells per ml and generally not more than 10^6 cells per ml.

Various culture media can be used and non-limiting examples include
Iscove's modified Dulbecco's medium (IMDM), X-vivo 15 and RPMI-1640. These are
commercially available from various vendors. The formulations may be supplemented with 
a variety of different nutrients, growth factors, such as cytokines and the like. In general, the 
term cytokine refers to any one of the numerous factors that exert a variety of effects on 
cells, such as inducing growth and proliferation. The cytokines may be human in origin or 
may be derived from other species when active on the cells of interest. Included within the 
scope of the definition are molecules having similar biological activity to wild type or 
purified cytokines, for example produced by recombinant means, and molecules which bind 
to a cytokine factor receptor and which elicit a similar cellular response as the native 
cytokine factor. 

The medium can be serum free or supplemented with suitable amounts of 
serum such as fetal calf serum, autologous serum or plasma. If cells or cellular products are 
to be used in humans, the medium will preferably be serum free or supplemented with 






WO 2004/071443 



PCT/US2004/004007 



autologous serum or plasma (see, e.g., Lansdorp et al., J. Exp. Med. 175:1501, 1992; and
Petzer et al., PNAS 93:1470, 1996).

Examples of compounds that can be used to supplement the culture medium
are thrombopoietin (TPO), Flt3 ligand (FL), c-kit ligand (KL, also known as stem cell factor,
SCF or Stl), interleukins (e.g., IL-1, IL-4, [...] IL-12), granulocyte colony stimulating
factor (G-CSF), granulocyte-macrophage colony stimulating factor (GM-CSF), leukemia
inhibitory factor (LIF), MIP-1α, and erythropoietin (EPO). These compounds may be used
alone or in any combination. When murine stem cells are cultured, a preferred
non-limiting medium includes mIL-3, mIL-6 and mSCF.

The concentration range of these compounds to be used in culture can be
determined according to knowledge well known in the art. For example, a preferred
range of TPO is from about 0.1 ng/mL to about [...], more preferred [...]
ng/mL. A preferred concentration range for each of FL and KL is from about 0.1 [...]
to about 500 ng/mL, and more preferred from about [...] ng/mL to about [...]
ng/mL. Hyper IL-6, a covalent complex of IL-6 and IL-6 receptor, may also be used in the
culture.

Other molecules can also be added to the culture media, for instance,
adhesion molecules, such as fibronectin or RetroNectin™ (commercially [...]) [...]
complex with collagen.

V. Therapeutic Applications

HSCs are the active component in bone marrow transplantation (BMT). The
use of purified HSCs as a transplant, as opposed to bone marrow, provides the advantage that
[...] of harmful non-HSC cells in the bone marrow is avoided, [...] the autologous
tumor or diseased cells back to the patient along with the bone marrow. In allogeneic [...]






system. Thus, expansion of HSCs would make autologous and allogeneic HSC 
transplantation safer and more effective. 

The present invention provides methods for inhibiting HSC differentiation 
and promoting HSC expansion in vivo in a subject, e.g., a human subject engrafted with 
HSCs. Using HSC differentiation-inhibiting molecules identified in the present invention, 
these methods allow expansion of non-differentiated stem cells and increase the dose of 
HSCs either ex vivo or in vivo, thereby potentially allowing more rapid engraftment. The 
HSC differentiation-inhibiting molecules can be expressed in the engrafted HSCs. They can
also be separately provided to the subject receiving the HSC graft, e.g., expressed from a 
vector introduced into the subject. In addition, the HSC differentiation-inhibiting molecules 
can also be administered to the subject as an expressed polypeptide, e.g., a growth factor. As 
a result, differentiation of the cells is blocked or slowed down, resulting in expansion of non- 
differentiated stem cells. 

Some methods of the invention provide ex vivo gene therapy for transplanting
genetically modified HSCs into a subject. For example, vectors expressing an HSC
differentiation-inhibiting polypeptide can be delivered to HSCs explanted from an individual
subject, followed by reimplantation of the cells into a subject, usually after selection for cells
that have incorporated the vector. Procedures for modifying host cells with an HSC
differentiation-inhibiting polynucleotide (e.g., GATA3) are described above. In addition, ex
vivo cell transfection for diagnostics, research, or for gene therapy (e.g., via re-infusion of
the transfected cells into the host organism) is well known in the art. For a review of gene
therapy procedures, see Anderson, Science 256: 808-813, 1992; Nabel & Felgner, TIBTECH
11: 211-217, 1993; Mitani & Caskey, TIBTECH 11: 162-166, 1993; Mulligan, Science 260:
926-932, 1993; Dillon, TIBTECH 11: 167-175, 1993; Miller, Nature 357: 455-460, 1992;
Van Brunt, Biotechnology 6: 1149-1154, 1988; Vigne, Restorative Neurology and
Neuroscience 8: 35-36, 1995; Kremer & Perricaudet, British Medical Bulletin 51: 31-44,
1995; Haddada et al., in Current Topics in Microbiology and Immunology (Doerfler &
Bohm eds., 1995); and Yu et al., Gene Therapy 1: 13-26, 1994.

For therapeutic applications, the genetically modified HSCs are
maintained for a period of time sufficient for overexpression of the HSC differentiation-
inhibiting polypeptide. A suitable time period will depend inter alia upon the cell type used and
is readily determined by one skilled in the art. In general, genetically modified cells of the
invention may overexpress the HSC differentiation-inhibiting polypeptide for the lifetime of the
host cell. Preferably, for hematopoietic cells the time period will be in the range of 1 to 45 
days, more preferably in the range of 1 to 30 days, even more preferably in the range of 1 to 
20 days, still more preferably in the range of 1 to 10 days, and most preferably in the range 
of 1 to 5 days. 

Other than ex vivo gene therapy, vectors expressing an HSC differentiation- 
inhibiting polypeptide can also be delivered in vivo. This is carried out by administering to 
an individual subject the expression vector, typically by systemic administration (e.g., 
intravenous, intraperitoneal, intramuscular, subdermal, or intracranial infusion) or topical 
application. Methods for in vivo gene therapy are also well known in the art, e.g., as 
described in the literature noted above.

As described above, other than gene therapy, therapeutic expansion of HSCs
in a subject can also be achieved by directly administering an HSC differentiation-inhibiting
polypeptide (or its fragment or functional derivative) to a subject. The subject can be
simultaneously engrafted with HSCs. The subject can also be one that has not been subjected
to HSC transplantation. Typically, in such applications, the HSC differentiation-inhibiting
polypeptide (e.g., GATA3) is administered to the subject in a pharmaceutical composition. 
The pharmaceutical compositions typically comprise at least one active ingredient together 
with one or more acceptable carriers thereof. Suitable carriers for preparing the 
pharmaceutical compositions, appropriate dosages, and suitable routes of administration of 
the compositions can all be readily determined by following methods well known in the art. 
See, e.g., Gilman et al., eds., Goodman and Gilman's: The Pharmacological Basis of
Therapeutics, 8th ed., Pergamon Press, 1990; Remington: The Science and Practice of
Pharmacy, Mack Publishing Co., 20th ed., 2000; Avis et al., eds., Pharmaceutical Dosage
Forms: Parenteral Medications, published by Marcel Dekker, Inc., N.Y., 1993; and
Lieberman et al., eds., Pharmaceutical Dosage Forms: Tablets, published by Marcel Dekker, 
Inc., N.Y., 1990. 

EXAMPLES 

The following examples are provided to illustrate, but not to limit, the present
invention.

Example 1. Genes Upregulated in Human HSCs

This Example describes RNA profiling of human hematopoietic stem cells 
and characterization of genes upregulated in the HSCs. All procedures and assays employed 
herein to study the human HSCs have been described in the art, e.g., as noted above. 

CD34+ cells were first isolated from blood of six normal human donors using
magnetic beads. Fluorescence-activated cell sorting (FACS) was then used to purify CD34+ Thy+
(stem enriched) and CD34+ Thy- (stem depleted) cell populations. The two populations of
cells (total 12 samples, 6 CD34+ Thy+ and 6 CD34+ Thy-) were assayed for bioactivity with
the CFC assay. RNA profiling (Thy+ vs Thy-) was then carried out to identify genes
differentially expressed in stem cells. Results of the profiling are shown in Table 1. The
data indicate that the upregulated genes encode proteins with diverse biochemical and 
cellular functions. 

In addition, genes upregulated in CD34+ Thy+ HSCs from two different
sources, bone marrow and peripheral blood, were compared for overlapping sequences that
are enriched in HSCs from both sources. A total of 30 genes were found to be
upregulated in HSCs from both sources. An exemplary list of these genes is shown in Table
2. Both HSC types express transcription factors, some of which are known proto-oncogenes
(e.g., GATA3, HLF, Evi1, PMX1, MN1, ATF3).
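
The comparison of the two HSC sources amounts to a set intersection over gene identifiers. A minimal sketch in Python, assuming each source's upregulated genes are available as a list of gene symbols. The function name `shared_upregulated` and the two input lists are hypothetical illustrations (the lists reuse a few gene names mentioned in this Example and are not the contents of Table 2):

```python
def shared_upregulated(bone_marrow_genes, peripheral_blood_genes):
    """Return the sorted gene symbols upregulated in HSCs from both sources."""
    return sorted(set(bone_marrow_genes) & set(peripheral_blood_genes))


# Illustrative input only: placeholder lists, not the data behind Table 2.
bone_marrow = ["GATA3", "HLF", "MN1", "ATF3", "PLACEHOLDER_BM_ONLY"]
peripheral_blood = ["GATA3", "HLF", "MN1", "ATF3", "ID1", "ID2", "ID3"]

common = shared_upregulated(bone_marrow, peripheral_blood)
print(common)
```

Working on sets keeps the comparison independent of the order in which the genes were reported for each source.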

Further, the results indicate that HSCs from peripheral blood, but not HSCs 
from bone marrow, are enriched in histones and inhibitory HLH transcription factors (ID1, 
ID2, and ID3). The data also suggest new cell surface markers for HSCs. Examples include 
5T4, EphA3, TNFSF3, EVI2b, and DLK1. Several potential neuropeptides are also upregulated,
including Vasopressin (AVP), Oxytocin (OXT), and Vasodilators.

Example 2. Inhibition of HSC Differentiation by Overexpressing an HSC Differentiation-Inhibiting Polypeptide

This Example describes the effects on HSC differentiation of constitutive
expression of an HSC differentiation-inhibiting gene in CD34+ Thy+ cells using retroviral
vectors. First, the effect of overexpressing ID3 was analyzed with the colony-forming cell (CFC)
assay. Other assays such as the cobblestone area forming cell (CAFC) assay and the NOD/SCID
(nonobese diabetic mice with severe combined immunodeficiency disease) repopulating cell
assay can also be used in these analyses. These assays can be performed as
described above and are well known in the art (e.g., Kusadasi et al., Leukemia 14: 1944-53,
2000; and Larochelle et al., Nature Medicine, 2: 1329-1337, 1996).

Fig. 1 illustrates the schematic structure of the retroviral vectors used in the
study. Gene X in the figure denotes any of the HSC genes (e.g., ID3) to be examined.
The vectors also express green fluorescent protein (GFP). When the GFP gene is
introduced into cells by transfection or infection, the encoded GFP fluoresces green under
ultraviolet light and thus enables simple detection of the transfected or infected cells.

A vector harboring the HSC gene (e.g., ID3 or GATA3) was transfected into 
the CD34 + cells. Cells expressing the gene were sorted and assayed with the CFC assay. As 
shown in Fig. 2, ID3 over-expression increased the number of colony forming cells (e.g., 
primitive BFU-E colonies). This suggests enhanced HSC activity, indicating that 
differentiation of the stem cells has been slowed down. 

The HSC differentiation-inhibiting genes were also examined for their effects
on HSC growth in liquid culture. The effect of GATA3 over-expression on human HSC
differentiation was examined in liquid culture. Here, stem cells were transfected with the
same vectors described above (which harbor the ID1 gene, GATA3 gene, or no HSC gene),
and grown in liquid culture. CD34+ and GFP+ cells were sorted. Expression of CD34 was
monitored during the culture. Cells without transfection were used in a control analysis.
The results indicate that, as compared to the control, ID1 had no effect on differentiation of
the CD34+ cells. However, expression of GATA3 significantly slowed the differentiation
process as indicated by the rate of reduction of CD34+ cells.

Example 3. Novel Molecular Markers Expressed in Mouse HSCs

This Example describes use of RNA expression profiling to characterize
purified mouse HSCs. Mouse HSCs were purified using a combination of antibodies to cell-
surface markers. The following three cell populations were purified from murine bone
marrow as described in Zhao et al., Blood 96: 3016-22, 2000; and Zhong et al., Blood 100:
3521-6, 2002:

Cell type            Immunophenotype                            HSC activity
LT-HSC               Lin-, c-kit+, Sca-1+, CD38+, CD34-         1X
Facilitator Cells    Lin-, c-kit+, Sca-1+, CD38-, CD34+         0.1X
Progenitor Cells     Lin-, c-kit+, Sca-1+, CD38+, CD34+         0.1X
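
The three immunophenotypes differ only in their CD38 and CD34 status, which can be expressed as a small lookup. The following is a minimal sketch in Python; the function name `classify` and the encoding of marker status as "+" or "-" strings are assumptions made for illustration and do not describe the sorting gates actually used:

```python
# CD38/CD34 signs that distinguish the three sorted populations.
# All three populations share Lin-, c-kit+, Sca-1+.
POPULATIONS = {
    ("+", "-"): "LT-HSC",             # CD38+, CD34-
    ("-", "+"): "Facilitator Cells",  # CD38-, CD34+
    ("+", "+"): "Progenitor Cells",   # CD38+, CD34+
}


def classify(lin, c_kit, sca1, cd38, cd34):
    """Map marker signs ('+' or '-') to a population name, or None if none match."""
    if (lin, c_kit, sca1) != ("-", "+", "+"):
        return None
    return POPULATIONS.get((cd38, cd34))


print(classify("-", "+", "+", "+", "-"))
```

A cell that is CD38- CD34-, or that fails the shared Lin- c-kit+ Sca-1+ requirement, falls outside all three populations and yields `None`.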

Cells were purified from normal BL6 mice using flow cytometry. Three 
different preparations of sorted cells for each population were prepared and combined prior 
to the isolation of total RNA. The RNA was quantified using the Ribogreen fluorescence- 
based solution assay (e.g., as described in Jones et al., Anal Biochem 265: 368-74, 1998). 
10 ng of each pooled RNA preparation was labeled in duplicate using the triple labeling
procedure (as described, e.g., in Hrabovszky et al., J. Histochem. Cytochem. 43: 363-370,
1995) and hybridized to Affymetrix U74A gene chips according to the manufacturer's
instructions. Intensity values were obtained for each gene and sample using GeneChip
software. These average difference (AD) values were exported to a spreadsheet program
and analyzed by filtering for genes that are expressed above a threshold criterion (50 in
at least two samples), whose average for each population differed by more than 2-fold
(up or down) between any two cell populations, and for which ANOVA analysis showed a
significant difference (P<0.01) between any two populations.
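
The first two filtering criteria can be sketched as follows. This is an illustrative Python sketch, not the spreadsheet procedure itself: the function name `passes_filters` and the data layout (a mapping from population name to replicate AD values) are assumptions, the ANOVA step (P<0.01) is deliberately left out, and treating non-positive averages as failing is a simplification chosen here:

```python
from statistics import mean

EXPRESSION_THRESHOLD = 50  # minimum AD value, as stated in the text
MIN_SAMPLES_ABOVE = 2      # "in at least two samples"
MIN_FOLD = 2.0             # more than 2-fold between any two populations


def passes_filters(ad_values):
    """ad_values maps population name -> list of AD values, one per replicate."""
    all_values = [v for reps in ad_values.values() for v in reps]
    # Criterion 1: expressed above the threshold in at least two samples.
    if sum(v > EXPRESSION_THRESHOLD for v in all_values) < MIN_SAMPLES_ABOVE:
        return False
    averages = [mean(reps) for reps in ad_values.values()]
    lowest, highest = min(averages), max(averages)
    # Simplification: a non-positive average gives no meaningful ratio.
    if lowest <= 0:
        return False
    # Criterion 2: some pair of populations differs by more than 2-fold,
    # which holds exactly when the largest/smallest average ratio exceeds 2.
    return highest / lowest > MIN_FOLD


# Hypothetical AD values for one gene across the three populations.
example_gene = {
    "LT-HSC": [400.0, 420.0],
    "Facilitator": [60.0, 80.0],
    "Progenitor": [90.0, 110.0],
}
print(passes_filters(example_gene))
```

Comparing only the largest and smallest population averages is sufficient because any pair that differs by more than 2-fold implies the extreme pair does as well.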

Examples of genes upregulated in HSCs are shown in Table 3. The genes 
were analyzed for patterns using GeneSpring software and arranged by functional gene
classification using the Gene Ontology (GO). Accession numbers or identification numbers from other
public databases of these genes, as well as levels of up-regulation of these genes in HSCs as
compared to non-HSCs, are also shown in the table.

Example 4. Characterization of Genes Differentially Expressed in Mouse HSCs

To correlate stem cell activity of the three subsets with gene expression, a 
hypothetical stem cell activity pattern corresponding to the in vivo repopulating activity of 
the three subsets was generated and used for comparison of the normalized expression levels 
of each differentially expressed gene identified above. Principal Component Analysis
(PCA) on the stem cell expression data was performed to identify gene expression patterns.
This is an unsupervised computational method used to identify major patterns in diverse data
types including gene expression data (Alter et al., Proc Natl Acad Sci USA 97:10101-10106,
2000; and Holter et al., Proc Natl Acad Sci USA 97:8409-8414, 2000). The correlation
analysis of the gene expression patterns of the differentially expressed genes with stem cell
activity identified genes with highly significant (Pearson R >0.95) correlations. These genes
are shown in Table 4. In addition to genes upregulated in HSCs, the analysis also identified
genes whose expression negatively correlated with LTR HSC activity (i.e., down-regulated
expression). Examples of these genes are shown in Table 5.
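
The correlation step can be illustrated with a short Python sketch. It assumes the hypothetical activity pattern equals the relative HSC activities listed in Example 3 (1X, 0.1X, 0.1X), which the text does not state explicitly; the function names and the expression values below are likewise made up for illustration:

```python
from math import sqrt

# Assumed activity pattern for (LT-HSC, facilitator, progenitor).
ACTIVITY = [1.0, 0.1, 0.1]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / sqrt(sxx * syy)


def correlated_genes(expression, cutoff=0.95):
    """Return {gene: R} for profiles whose R with ACTIVITY exceeds the cutoff."""
    result = {}
    for gene, profile in expression.items():
        r = pearson_r(profile, ACTIVITY)
        if r > cutoff:
            result[gene] = r
    return result


# Hypothetical normalized expression levels in the three populations.
expression = {"gene_a": [10.0, 1.0, 1.0], "gene_b": [1.0, 5.0, 9.0]}
result = correlated_genes(expression)
print(sorted(result))
```

Genes that are negatively correlated with the activity pattern, such as `gene_b` here, could be collected the same way by testing for R below a negative cutoff.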

Some of the differentially expressed genes were further analyzed and 
classified according to their biological functions. The results are shown in Table 6. As 
shown in Tables 3, 4, and 6, the upregulated genes in mouse HSCs also encode proteins of 
diverse biological properties, similar to genes upregulated in the human HSCs. For example, 
a number of transmembrane proteins were enriched in the mouse HSCs, as exemplified in
Table 7. These molecules can be useful as novel surface markers for isolating HSCs. Some
of the transcription factors that are upregulated in the mouse HSCs are shown in Table 8. Their
upregulated expression levels in the CD34- CD38+ HSCs relative to those in the facilitator cells
(CD38- CD34+) and progenitor cells (CD34+ CD38+) are shown in Figure 3.

The expression of several known transcription regulation factors was found to 
correlate positively with LTR HSC activity. These include Cited2, GATA3, Hdac3, Irf6, JunB,
Nmyc1, Rnps1, Xbp1, and Zfp292. Little is known regarding the role of these specific
transcription factors in the control of HSC biology. These essential transcription factors 
could play an important role in regulating HSC development and differentiation. 

To determine if any of the differentially expressed transcription factors are 
themselves regulating transcription in LTR HSCs, we performed a search of putative 
upstream regulatory regions (10 kb upstream of start codons) of the interrogated genes for 
binding sites of the nine transcription factors. Statistical analysis of these results revealed 
that only the binding sites of GATA were significantly enriched (P<0.05) within the 
differentially expressed genes. Interestingly, this list contains a large fraction (20 of 52) of 
the genes whose expression positively correlated with HSC activity, suggesting the 
possibility that GATA may play an important role in the control of LTR HSC biology. A
small number of genes (3 of 20) whose expression is negatively correlated with HSC activity
also contained GATA binding sites, suggesting the possibility that low levels of GATA
expressed in STR HSCs may influence gene expression at later stages.
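
A binding-site search of this kind can be sketched as a motif scan over an upstream sequence. The sketch below is in Python and rests on loud assumptions: it uses the widely cited GATA consensus (A/T)GATA(A/G), whereas the text does not say which motif definition or search tool was used, and the function name `gata_site_positions` is hypothetical:

```python
import re

# Assumed consensus: (A/T)GATA(A/G). Lookahead patterns are used so that
# overlapping sites are all reported.
GATA_FORWARD = re.compile(r"(?=([AT]GATA[AG]))")
# Reverse complement of the consensus, for sites on the opposite strand.
GATA_REVERSE = re.compile(r"(?=([CT]TATC[AT]))")


def gata_site_positions(upstream_sequence):
    """Return sorted (0-based start, strand) pairs for candidate GATA sites."""
    seq = upstream_sequence.upper()
    hits = [(m.start(), "+") for m in GATA_FORWARD.finditer(seq)]
    hits += [(m.start(), "-") for m in GATA_REVERSE.finditer(seq)]
    return sorted(hits)


print(gata_site_positions("CCAGATAAGG"))
```

Counting such hits per gene over the 10 kb upstream regions would give the per-gene site counts on which an enrichment statistic could then be computed.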

To confirm the data from expression profiling, we performed semi-
quantitative RT-PCR on total RNA extracted from the three BM subsets for three of the LTR
HSC genes identified. These included the transcription factors Gata3 and JunB, and the
thrombopoietin receptor c-Mpl. The results demonstrated that all three mRNAs are
expressed at significantly higher levels in CD38+ CD34- cells compared to the other two
subsets.



Table 3. Genes Upregulated in Mouse HSCs

Symbol | Description | RefSeq | Swiss Prot Keywords | Fold (HSC vs. non-HSC)
AU044919 | expressed sequence AU044919 | AU044919 | Glycoprotein; Immunoglobulin C region; Immunoglobulin domain | [...]
[...] | Kruppel-like factor 2 (lung) | NM_[...] | Activator; DNA-binding; Metal-binding; Nuclear protein; Repeat; Transcription regulation; Zinc-finger | 44.9
Car1 | carbonic anhydrase 1 | NM_[...] | Lyase; Zinc | 36.8
Mm.[...] | Mus musculus anti-HIV-1 reverse transcriptase single-chain variable fragment mRNA, complete cds | NA | None | 30.1
2010309G21Rik | RIKEN cDNA 2010309G21 gene | none | Immunoglobulin C region; Immunoglobulin domain | 28.8
NA | M80423 Mus castaneus IgK chain [...] /gb=M80423 /gi=196865 /ug=Mm.46804 /len=323 mRNA | M80423 | None | 20.9
[...] | Fragilis | NM_025378 | None | 17.1


Smoc1 | SPARC related modular calcium binding 1 | NM_022316 | None | 15.8
5830413E08Rik | RIKEN cDNA 5830413E08 gene | NM_029083 | None | 14.9
5830431A10Rik | RIKEN cDNA 5830431A10 gene | none | None | 14.4
AI325941 | expressed sequence AI325941 | AI325941 | None | 14.2
Cdkn1c | cyclin-dependent kinase inhibitor 1C (P57) | NM_009876 | Alternative splicing; Cell cycle | 14.1
Lisch7 | liver-specific bHLH-Zip transcription factor | none | None | 13.9
AW108012 | expressed sequence AW108012 | AW108012 | None | 13.8
Akr1c13 | aldo-keto reductase family 1, member C13 | NM_013778 | None | 13.3
0910001L24Rik | RIKEN cDNA 0910001L24 gene | NM_022419 | None | 12.7
AI842353 | expressed sequence AI842353 | AI842353 | None | 11.7
Tgm2 | transglutaminase 2, C polypeptide | NM_009373 | Acyltransferase; Calcium-binding; Transferase | 11.4
Nckap1 | NCK-associated protein 1 | none | Transmembrane | 11.3
Serpina3g | serine (or cysteine) proteinase inhibitor, clade A, member 3G | none | None | 11.3
1700008C22Rik | RIKEN cDNA 1700008C22 gene | none | None | 10.4
Nmyc1 | neuroblastoma myc-related oncogene | NM_008709 | DNA-binding; Nuclear protein; Phosphorylation; Proto-oncogene | 10.4
Zfhx1a | zinc finger homeobox 1a | NM_011546 | Activator; DNA-binding; Homeobox; Metal-binding; Nuclear protein; Repeat; Repressor; Transcription regulation; Zinc-finger | 10.4
H2-Eb1 | histocompatibility 2, class II antigen E beta | NM_010382 | Glycoprotein; MHC II; Signal; Transmembrane | 10.0
AU044919 | expressed sequence AU044919 | AU044919 | Glycoprotein; Immunoglobulin C region; Immunoglobulin domain | 9.9
Gbp2 | guanylate nucleotide binding protein 2 | NM_010260 | None | 9.5
Gabbr1 | gamma-aminobutyric acid (GABA-B) receptor, 1 | NM_019439 | Alternative splicing; Coiled coil; G-protein coupled receptor; Glycoprotein; Postsynaptic membrane; Repeat; Signal; Transmembrane | 9.5
D8Ertd69e | DNA segment, Chr 8, ERATO Doi 69, expressed | none | None | 9.2
Gata3 | GATA binding protein 3 | NM_008091 | Activator; DNA-binding; Nuclear protein; T-cell; Transcription regulation; Zinc-finger | 9.1
C130052I12Rik | RIKEN cDNA C130052I12 gene | NM_146047 | None | 8.7
0610025I19Rik | RIKEN cDNA 0610025I19 gene | NM_029555 | None | 8.6
Tcf15 | transcription factor 15 | NM_009328 | None | 8.6
H2-Aa | histocompatibility 2, class II antigen A, alpha | NM_010378 | 3D-structure; Glycoprotein; MHC II; Signal; Transmembrane | 8.5
Tal1 | T-cell acute lymphocytic leukemia 1 | NM_011527 | Chromosomal translocation; Differentiation; DNA-binding; Phosphorylation; Proto-oncogene; Transcription regulation | 8.3
Myoz1 | myozenin 1 | NM_021508 | None | 7.9
493042U07Rik | RIKEN cDNA 493042U07 gene | none | None | 7.4
Igh-6 | immunoglobulin heavy chain 6 (heavy chain of IgM) | none | Alternative splicing; Glycoprotein; Immunoglobulin C region; Immunoglobulin domain; Transmembrane | 7.3
Hoxb5 | Homeo box B5 | NM_008268 | Developmental protein; DNA-binding; Homeobox; Nuclear protein; Transcription regulation | 7.3
Col9a1 | procollagen, type IX, alpha 1 | NM_007740 | Alternative splicing; Cartilage; Collagen; Connective tissue; Extracellular matrix; Glycoprotein; Hydroxylation; Repeat; Signal | 7.2
Meis1 | myeloid ecotropic viral integration site 1 | NM_010789 | None | 7.1
Ela1 | elastase 1, pancreatic | none | None | 7.0
Hiat1 | hippocampus abundant gene transcript 1 | NM_008246 | None | 7.0
Fah | fumarylacetoacetate hydrolase | NM_010176 | Hydrolase; Phenylalanine catabolism; Tyrosine catabolism | 6.9
Cyp4f13 | cytochrome P450 CYP4F13 | NM_130882 | None | 6.7
NA | Mus musculus transcription factor PBX3b (PBX3b) mRNA, complete cds [...] /gb=AF020200 /gi=2432016 /ug=Mm.7331 /len=2467 mRNA | AF020200 | None | 6.5
Igj | immunoglobulin joining chain | NM_152839 | Glycoprotein; Signal | 6.3
NA | AV336991 Mus musculus cDNA, 3' end /clone=6332407A01 /clone_end=3' /gb=AV336991 /gi=6377043 /ug=Mm.99212 /len=201 /NOTE=replacement for probe set(s) 100264_f_at on MG-U74A mRNA | AV336991 | None | 6.2








Ctla2b | cytotoxic T lymphocyte-associated protein 2 beta | none | Repeat; Signal; T-cell | 6.1
Serpinb6 | serine (or cysteine) proteinase inhibitor, clade B, member 6 | NM_009254 | Serine protease inhibitor; Serpin | 5.8
Mm.29940 | ESTs | NA | None | 5.8
AU043625 | expressed sequence AU043625 | NM_133910 | None | 5.8
Col4a1 | procollagen, type IV, alpha 1 | none | Basement membrane; Collagen; Connective tissue; Extracellular matrix; Glycoprotein; Hydroxylation; Repeat; Signal | 5.6
Igh-4 | immunoglobulin heavy chain 4 (serum IgG1) | none | Alternative splicing; Glycoprotein; Immunoglobulin C region; Immunoglobulin domain | 5.5
Siat6 | sialyltransferase 6 (N-acetyllacosaminide alpha 2,3-sialyltransferase) | NM_009176 | Glycoprotein; Glycosyltransferase; Golgi stack; Signal-anchor; Transferase; Transmembrane | 5.4
[...] | immunoglobulin kappa chain, constant region | none | None | 5.4
Sdpr | Serum deprivation response | NM_138741 | None | 5.4
Dusp1 | dual specificity phosphatase 1 | NM_013642 | Cell cycle; Hydrolase | 5.3
Cited2 | Cbp/p300-interacting transactivator, with Glu/Asp-rich carboxy-terminal domain, 2 | NM_010828 | Alternative splicing; Nuclear protein | 5.2
Epor | erythropoietin receptor | NM_010149 | Glycoprotein; Receptor; Signal; Transmembrane | 5.1
Mm.200980 | Mus musculus, Similar to translocation protein 1, clone IMAGE:5347105, mRNA, partial cds | NA | None | 5.0
Atf2 | activating transcription factor 2 | none | Activator; Alternative splicing; DNA-binding; Metal-binding; Nuclear protein; Phosphorylation; Transcription regulation; Zinc-finger | 5.0
Ccne1 | cyclin E1 | NM_007633 | Cell cycle; Cell division; Cyclin; Nuclear protein; Phosphorylation | 5.0
Mllt3 | myeloid/lymphoid or mixed lineage-leukemia translocation to 3 homolog (Drosophila) | NM_027326 | None | 4.9
D5Ertd40e | DNA segment, Chr 5, ERATO Doi 40, expressed | none | None | 4.9
Zfp216 | zinc finger protein 216 | NM_009551 | None | 4.8
Syp | synaptophysin | NM_009305 | Calcium-binding; Glycoprotein; Nerve; Phosphorylation; Repeat; Synapse; Synaptosome; Transmembrane | 4.8
Nedd4 | neural precursor cell expressed, developmentally down-regulated gene 4 | NM_010890 | Ligase; Repeat; Ubiquitin conjugation | 4.7
Pbx1 | pre B-cell leukemia transcription factor 1 | NM_008783 | None | 4.7
6330407G11Rik | RIKEN cDNA 6330407G11 gene | NM_023423 | None | 4.6
Ash1 | absent, small, or homeotic discs 1 (Drosophila) | NM_138679 | None | 4.5
Lrmp | lymphoid-restricted membrane protein | NM_008511 | None | 4.5
Casp8ap2 | caspase 8 associated protein 2 | NM_011997 | None | 4.5
Mm.30163 | Mus musculus, clone IMAGE:4952607, mRNA | NA | None | 4.5
Ctsl | cathepsin L | NM_009984 | Glycoprotein; Hydrolase; Lysosome; Signal; Thiol protease; Zymogen | 4.5
s 
( 

Sfpq £ 
2010004A03Rik I 


plicing factor prolme/glutamine rich 
polypyrimidine tract binding protein 

tssociated) ? 

IIKEN cDNA 7-010004A03 gene r 
;arbonic anhydrase 2 * 


JM 023603 . r 
tone „ __ 1 
>JM 009801 1 


4 one 
<one 

_yase Zinc 


4.4 
4.3 
4.2 
4.1 


Car2 « 

Mm 22896 

AIS73938 

Vasp 


ESTs 1 
expressed sequence AI573938 ! 

vasodilator-stimulated phosphoprotein _ 
expressed sequence AA40845 1 


vlA * 

• 

lone J 
none 

AA408451 


^one 

^one . 

A.ctin-binding 

, nospnoryianuH 

None 


3.9 

3.9 
3.7 
3.6 


AA408451 

Pftkl 


PFTAIRE protein kinase 1 
TGFB inducible early growth 


NM 011074 
NM 013692 


None 

DNA-binding Metal- 
binding Nuclear protein 
Repeat Repressor 
Transcription regulation 
Zinc-finger 


3.6 


Tiee 


response 

immunoglobulin kappa chain variable 


none 


Immunoglobulin C region I 
Immunoglobulin domain 


3.6 


Igk-V28 
Mm. 1806 


28 (V28) 

Mus musculus, Similar to K1AA1404 
protein, clone IMAGE: 5252426, 
mRNA, partial cds 


NA 
NA 


None 1 
None 


i 

3.5 
3.5 




ESTs — 
CCR4 carbon catabolite repression 4- 
like (S. cerevisiaei 


none 


Biological rhythms 


3.5 


Ccm41 
Cpo 


coproporphyrinogen oxidase _ 


NM 007757 
NM 019738 


Heme biosynthesis Iron 1 

Mitochondrion 

Oxidoreductase Porphyrin 

biosynthesis Transit 

peptide 

None 


3.5 
3.5 




nuclear protein 1 . 

similar to gene overexpressed in 
astrocytoma FHomo sapiens] 


NA 


Nnnp. 


! 3.4 


Mm.5510 
Rab33b 


RAB33B, member of RAS oncogene 
family . , 


NM 016858 


Golgi stack GTP-binding 
Lipoprotein Prenylation 
Protein transport 
None 


3.4 


Qd'MOfi'iL^Rik 


RIKEN cDNA 9430065L19 Eene 
progesterone receptor 


NM 146083 
NM 008829 


DN A-binding Nuclear 
protein Receptor Steroid- 
binding Transcription 
regulation Zinc-finger 


4— 3,4 


1JOC218490 


similar to Transcription factor B 1 F3 
(RNA polymerase B transcription 
factor 3) 


NM 145455 


Alternative splicing 
Nuclear protein 
Transcription regulation 
None 


1 3.4 
1 3.3 


4930434H03Rik 


RIKEN cDNA4930434H03 eene 
Actinin alpha 3 


none 

NM 013456 


Actin-Dinaing iviuuigcnc 
family Repeat 


-J— 33 


Actn3 

Mm.202311 


Mus museums, clone 
IMAGE:1379624. mRNA T partial cds 


NA 

NM 019440 


GTP-binding Lipoprotein 
Membrane Multigene 
family Palmitate 
Transducer 
None 


33 

r 3.3 


Gtpi 


intprferon-e induced (J lipase 

N-acetyltransferase 2 (arylamine N- 
acetyltransferase) . 


NM 010874 


Acyltransferase Multigene 
family Polymorphism 
Transferase 


1 3.3 


Nat2 

Eva2 

lU0037NQ9Rik 
5Q33414P02Rik 
Mm.26147 


eves absent 2 hnmolog (Drosophila) 
RIKEN cDNA 1 1 1Q037N09 gene 
RIKEN cDNA <>Q33414D02 gene 
ESTs 


none 
none 

NM 026362 
NA 


Alternative splicing 

Developmental protein 

MuUigene family 

None 

None 

None 


! 3.3 
32 

i 3.1 

1 3.1 
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114 


interleukin 4 


NM 021283 


B-cell activation Cytokine 
Glycoprotein Growth 
factor Signal 


3.1 


Ubapl 


ubiquitin-associated protein I 


NM 023305 


None 


3.1 


Acoxl 


acyl-Coenzyme A oxidase 1, 
Dalmitoyl 


NM 015729 


FAD Fatty acid 
metabolism Flavoprotein 
Oxidoreductase 
Peroxisome 


2.9 


Ccl5 


chemokine (C-C motif) ligand 5 


NM 013653 


Chemotaxis Cytokine 
Inflammatory response 
Signal T-cell 


2.9 


AW457192 


expressed sequence AW457192 


NM 134084 


Cyclosporin Isomerase 
Mitochondrion Multigene 
family Rotamase Transit 
peptide 


2.9 


26100I6KlIRik 


RIKEN cDNA 2610016K1 1 gene 


none 


None 


2.8 


Fzd4 


frizzled homolog 4 (Drosophila) 


NM 008055 


Developmental protein G- 
protein coupled receptor 
Glycoprotein Multigene 
family Signal 
Transmembrane 


2.8 


Pla2g4a 


phospholipase A2, group IVA 
(cytosolic, calcium-dependent) 


NM 008869 


Calcium Hydrolase Lipid 

degradation 

Phosphorylation 


2.8 


Scin 


scinderin 


NM 009132 


None 


2.7 


NA 


AV239653 Mus musculus cDNA, 3' 
end /clone=4732435F04 
/clone_end=3' /gb=AV239653 
/gi=6192160 /ug=Mm.88313 
/len=214 /NOTE=replacement for 
probe set(s) 96411_f_at on MG-U74A 
mRNA 


AV239653 


None 


2.7 


Tcf12 


transcription factor 12 


NM 011544 


Alternative splicing 
Developmental protein 
DNA-binding Nuclear 
protein Transcription 
regulation 


2.7 


Madh7 


MAD homolog 7 (Drosophila) 


NM 008543 


Alternative splicing 
Multigene family 
Transcription regulation 


2.7 


Gem 


GTP binding protein (gene 
overexpressed in skeletal muscle) 


NM 010276 


GTP-binding Membrane 
Phosphorylation 


2.7 


Tpm1 


tropomyosin 1, alpha 


NM 024427 


3D-structure Acetylation 
Alternative splicing Coiled 
coil Multigene family 
Muscle protein 
Phosphorylation Repeat 


2.7 


Map17 


membrane-associated protein 17 


NM 026018 


None 


2.7 


Dcx 


doublecortin 


NM 010025 


Neurogenesis Neurone 
Phosphorylation Repeat 


2.7 


Igk-V28 


immunoglobulin kappa chain variable 
28 (V28) 


none 


Immunoglobulin C region 
Immunoglobulin domain 


2.6 


Rnf11 


ring finger protein 11 


NM 013876 


None 


2.6 


Nfix 


nuclear factor I/X 


NM 010906 


None 


2.6 


Lin7c 


lin 7 homolog c (C. elegans) 


NM 011699 


None 


2.5 


Cln3 


ceroid lipofuscinosis, neuronal 3, 
juvenile (Batten, Spielmeyer-Vogt 
disease) 


NM 009907 


Glycoprotein Lysosome 
Transmembrane 


2.5 


Hhex 


hematopoietically expressed 
homeobox 


NM 008245 


Developmental protein 
DNA-binding Homeobox 
Nuclear protein 


2.5 


Gab1 


growth factor receptor bound protein 
2-associated protein 1 


NM 021356 


None 


2.5 


None 


none 


none 


None 


2.5 


Kcnj3 


potassium inwardly-rectifying 


NM 008426 


Ion transport Ionic channel 


2.5 
channel, subfamily J, member 3 


Potassium transport 
Transmembrane Voltage- 
gated channel 




Cradd 


CASP2 and RIPK1 domain 

containing adaptor with death domain 


NM 009950 

NA 


Apoptosis 
None 


2.5 
2.4 


Mm.29914 
Fos 

Mm.24247 
4930472G13Rik 


ESTs 

FBJ osteosarcoma oncogene 
RIKEN cDNA 4930472G13 gene 


NM 010234 
NA 

NM 029447 


DNA-binding Nuclear 

protein Phosphorylation 

Proto-oncogene 

None 

None 


2.4 
2.4 


Ormdl3 

Umpk 

Creg 


ORM1-like 3 (S. cerevisiae) 

uridine monophosphate kinase 
cellular repressor of E1A-stimulated 




NM 025661 
none 

NM 011804 


None 

Kinase Transferase 

None 
None 


2.4 
2.4 

2.4 
2.3 


Utrn 

Mm.27769 


utrophin 

ESTs, Weakly similar to RIKEN 
cDNA 0610011E17 [Mus musculus] 
[M.musculus] 


none 
NA 

NM 018738 


None 
None 


2.3 
2.3 


Igtp 


interferon gamma induced GTPase 
arginase type II 


NM 009705 


Arginine metabolism 
Hydrolase Manganese 
Mitochondrion Transit 
peptide Urea cycle 


2.3 


Arg2 
Pklr 


pyruvate kinase liver and red blood 
cell 


NM 013631 


Alternative splicing 
Glycolysis Kinase 
Magnesium Multigene 
family Phosphorylation 
Transferase 
None 


2.2 
2.2 


1810010A06Rik 
Mm.532 


RIKEN cDNA 1810010A06 gene 
ESTs, Weakly similar to 
lysophospholipase 1; phospholipase 
1a; lysophospholipase 1 [Mus 
musculus] [M.musculus] 


NM 026921 
NA 


None 


2.2 


Vamp5 

0710001 003Rik 


vesicle-associated membrane protein 
5 

RIKEN cDNA 0710001003 gene 


NM 016872 
NM 146094 


Multigene family 

Myogenesis Signal-anchor 

Transmembrane 

None 

None 


2.2 
2.2 
2.2 


2610003J05Rik 
Tde1l 

Serpinf1 


RIKEN cDNA 2610003J05 gene 
tumor differentially expressed 1, like 
serine (or cysteine) proteinase 
inhibitor, clade F, member 1 


none 

NM 019760 
NM 011340 


None 

Glycoprotein Serpin Signal 
None 


2.2 

2.1 
2.1 


Scotin 
G3bp2 

1190002H23Rik 


scotin gene 

Ras-GTPase-activating protein 
(GAP120) SH3-domain binding 
protein 2 

RIKEN cDNA 1190002H23 gene 


NM 025858 

NM 011816 
NM 025427 
NM 010940 


None 
None 
None 


2.1 
2.1 
2.1 


Nsccn1 
Tgoln2 

Ywhae 


non-selective cation channel 1 
trans-golgi network protein 2 
tyrosine 3-monooxygenase/tryptophan 
5-monooxygenase activation protein, 
epsilon polypeptide 


1NIV1 V I \J yr\J 

NM 009444 
NM 009536 


None 

None 
None 


2.1 

2.1 
2.1 


4631408O11Rik 
Pou2af1 


RIKEN cDNA 4631408O11 gene 
POU domain, class 2, associating 
factor 1 


none 

NM 011136 


Nuclear protein 
Transcription regulation 


2.1 


Mm.220953 


Mus musculus, clone 
IMAGE:4206769, mRNA 


NA 


None 


2.1 




caspase6 


NM 009811 


Apoptosis Hydrolase Thiol 
protease Zymogen 


2.0 


Casp6 


none 


none 


Glycoprotein 
Immunoglobulin C region 
Immunoglobulin domain 


2.0 


None 

Nr4a1 


nuclear receptor subfamily 4, group 
A, member 1 


NM 010444 


DNA-binding Nuclear 
protein Phosphorylation 
Receptor Transcription 


2.0 
regulation Zinc-finger 




1700023O11Rik 


RIKEN cDNA 1700023O11 gene 


NM 029339 


None 


2.0 


Brca2 


breast cancer 2 


NM 009765 


Polymorphism Repeat 


2.0 


H2-T22 


histocompatibility 2, T region locus 
22 


NM 010397 


None 


2.0 



Table 4 Genes With Upregulated Expression and Correlated Stem Cell Activity 



Symbol or Acc. No. 


Gene Description or similarity 
to known proteins 


Correlation to stem cell 


Unigene No. 


Rnps1 


ribonucleic acid binding 
protein S1 


1.000 


Mm.1951 


Junb 


Jun-B oncogene 


1.000 


Mm.1167 


Hdac3 


histone deacetylase 3 


1.000 


Mm.20521 


Irf6 


interferon regulatory factor 6 


1.000 


Mm.4179 


Gata3 


GATA binding protein 3 


0.997 


Mm.606 


Xbp1 


X-box binding protein 1 


0.993 


Mm.22718 


Cited2 


Cbp/p300-interacting 
transactivator, with Glu/Asp- 
rich carboxy-terminal domain, 
2 


0.992 


Mm.9524 


Nmyc1 


neuroblastoma myc-related 
oncogene 1 


0.986 


Mm.16469 


Zfp292 


zinc finger protein 292 


0.975 


Mm.38193 


Bdkrb1 


bradykinin receptor, beta 1 


1.000 


Mm.57076 


Map17 


membrane-associated protein 
17 


0.995 


Mm.30181 


Ormdl3 


ORM1-like 3 (S. cerevisiae) 


0.990 


Mm.180546 


Fzd4 


frizzled homolog 4 
(Drosophila) 


0.988 


Mm.68712 


Lgi4 


leucine-rich repeat LGI 
family, member 4 


0.961 


Mm.1662 


Bdkrb1* 


bradykinin receptor, beta 1 


1.000 


Mm.57076 


Socs2 


suppressor of cytokine 
signaling 2 


0.996 


Mm.4132 


Fzd4* 


frizzled homolog 4 
(Drosophila) 


0.988 


Mm.68712 


Kit* 


kit oncogene 


0.961 


Mm.4394 


Inpp5d 


inositol polyphosphate-5- 
phosphatase D 


0.958 


Mm.15105 


Fbxo9 


f-box only protein 9 


1.000 


Mm.28584 


Nedd4 


neural precursor cell 
expressed, developmentally 
down-regulated gene 4 


0.993 


Mm.16553 


Rnf11 


ring finger protein 11 


0.992 


Mm.25228 


Ian1 


immune associated nucleotide 
1 


0.999 


Mm.28395 


Iigp 


interferon-inducible GTPase 


0.997 


Mm.29008 


Ifi47 


interferon gamma inducible 
protein 


0.984 


Mm.24769 




T-cell specific GTPase 


0.994 


Mm.15793 


Igtp 


interferon gamma induced 
GTPase 


0.993 


Mm.858 


Gtpi 


interferon-g induced GTPase 


0.989 


Mm.33902 


Serpinb6a 


serine (or cysteine) proteinase 
inhibitor, clade B, member 6a 


0.996 


Mm.2623 


Serpina3g 


serine (or cysteine) proteinase 
inhibitor, clade A, member 3G 


0.987 


Mm.15085 


Camk2b 


calcium/calmodulin-dependent 
protein kinase II, beta 


0.999 


Mm.4857 


Gab1 


growth factor receptor bound 


0.997 


Mm.24573 
Gabarapl1 



Mtmr13 



Mt2 



Car2 



Cdkn1c 



Lcn7 



A430017F18 



AU044919 



2310075M17Rik 



Ell2 



LOC207685 



23 10061 104Rik 



5830431A10Rik 



2700007P21Rik 



B930086G17 



2410166I05Rik 



D10Ertd749e 



2210023F24Rik 



Riken 4237666 



6230421P05Rik 



4631408O11Rik* 



1110054N06Rik* 



protein 2-associated protein 1 



gamma-aminobutyric acid 
(GABA(A)) receptor- 
associated protein-like 1 



myotubularin related protein 
13 



metallothionein 2 



carbonic anhydrase 2 



cyclin-dependent kinase 
inhibitor 1C (P57) 



lipocalin 7 



No 



No significant similar gene 



Similar to S3543 GTP-binding 
protein (90%) 



Eleven-nineteen lysine-rich 
leukemia gene 2 



Hypothetical protein 



No similar gene 



Contain Cor1/Xlr/Xmr 
conserved region 



Unknown protein 



No similar gene 



Hypothetical protein 



Similar to ZW10 interacting 
protein-1 



Contain B-box Zn-finger and 
SPRY domain 



No significant similar gene 



No similar gene 



No significant similar gene 



Unknown protein with Ankyrin 
repeat 



0.997 



0.996 
0.999 



0.995 



0.986 



0.999 



1.000 



1.000 



0.999 



0.998 



0.998 



0.998 



0.997 



0.997 



0.992 



0.989 



0.986 



0.983 



0.978 



0.978 



0.964 



0.960 



Mm.14638 



Mm.200250 



Mm.147226 



Mm.1186 



Mm.168789 



Mm.15801 



Mm.44883 



Mm.14438 



Mm.196592 



Mm.21288 



Mm.38214 



Mm.5624 



Mm.1148 



Mm.3587 



Mm.24738 



Mm.30153 



Mm.38994 



Mm.5510 



Mm.276231 



Mm.26147 



Mm.2935 



Mm.15351 



Table 5 Genes down-regulated in CD38+CD34- Cells 



Symbol or Acc. No. 


Description 


Correlation to SC 
activity 


Unigene No. 


Satb1 


Special AT-rich sequence 
binding protein 1 


0.955 


Mm.4381 


Ptpro 


Protein tyrosine phosphatase, 
receptor type, O 


0.999 
0.988 


Mm.4715 
Mm.1461 


Sell 
Ccl9 


Selectin, lymphocyte 
Chemokine (C-C motif) ligand 
9 


0.988 
0.988 


Mm.2271 
Mm.22171 


Cnn3 


Calponin 3, acidic 
Lectin, galactose binding, 
soluble 3 


0.971 


Mm.2970 


Lgals3 

Mki67 


Antigen identified by 
monoclonal antibody Ki 67 


0.998 
0.977 


Mm.4078 
Mm.4383 


Bin1 
Sult4a1 


Bridging integrator 1 
Sulfotransferase family 4A, 
member 1 


1.000 
0.996 


Mm.20451 
Mm.18603 


Hdc 

AI132321 


Histidine decarboxylase 
Contain phospholipase D 
active site motif 


-1.000 
-1.000 


Mm.203915 
Mm.23526 


2610036L13Rik 
BC018347 


No similar gene 

Similar to translation-initiation 
factor IF-2 


-1.000 

-1.000 


Mm.154309 
Mm.21579 


X90778 
AW060549 


Similar to Histone H2B 
Similar to Retrovirus-related 
POL polyprotein 


-0.999 


Mm.29177 
X67863 


Similar to Octapeptide-repeat 
protein T2 


-0.995 


Mm.35868 


X15378 


Similar to Myeloperoxidase 
and Eosinophil peroxidase 
precursor 


-0.915 


Mm.4668 


Plac8 


Uncharacterized Cys-rich 
domain containing protein 


-0.960 


Mm.34609 


D13Ertd275e 


Hypothetical protein 


-0.952 


Mm.21231 



Table 6. Classification and Characterization of Genes Upregulated in Mouse HSCs 



Class 


Name 


Sequence Description 


Sequence 
Code 


Unigene 
Code 


Protein 
ID 


Apoptosis 


Birc5 


baculoviral IAP repeat-containing 5 


101521 


Mm.8552 


070201 


Cell cycle 


Spin 


spindlin 


99563 


Mm.42193 




Chromosomal 


Btg1 


M.musculus btg1 mRNA. 


93104 




P31607 


Chromosomal 


Calm2 


Mus musculus calmodulin synthesis (CaM) 
cDNA, complete cds. 


93293 




P02593 


Enzyme 


Ctsl 


cathepsin L 


101963 


Mm.930 


P06797 


Enzyme 


Gdi1 


guanosine diphosphate (GDP) dissociation 
inhibitor 1 


97313 


Mm.205830 


P50396 


Enzyme 


Hadh2 


hydroxysteroid (17-beta) dehydrogenase 10 


101045 


Mm.6994 


008756 


Enzyme 


Mt2 


Mouse metallothionein II (MT-II) gene. 


101561 




P02798 


Enzyme 


Pnp 


purine-nucleoside phosphorylase 


93290 


Mm.17932 


P23492 


Enzyme 


Vdu1 


Vhlh-interacting deubiquitinating enzyme 1 


160710 


Mm.24383 




Kinase 


Csnk1e 


casein kinase I, epsilon 


97925 


Mm.30199 


090U13 


Kinase 


Nme3 


expressed in non-metastatic cells 3 


94981 i 


Mm.27278 




Lectin 


Lgals9 


lectin, galactose binding, soluble 9 


103335 


Mm.18087 


008573 


Metabolism 


Aldh1a1 


aldehyde dehydrogenase family 1, 
subfamily Al 


100068 


Mm.4514 


P24549 


Metabolism 


Aldh1a7 


aldehyde dehydrogenase family 1, 
subfamily A7 


94778 


Mm.14609 


035945 


Metabolism 


Cpo 


coproporphyrinogen oxidase 


98505 i 


Mm.35820 


P36552 


Metabolism 


Cpo 


coproporphyrinogen oxidase 


98506 r 


Mm.35820 


P36552 


Metabolism 


Ech1 


enoyl coenzyme A hydratase 1, peroxisomal 


93754 


Mm.2112 


035459 


Metabolism 


Mtcp1 


M.musculus MTCP-1 gene. 


103043 




061908 


Nuclear 


Rbmx 


RNA binding motif protein, X chromosome 


97848 


Mm.28275 


O9R0Y0 


Nuclear 


Snrpa 


small nuclear ribonucleoprotein polypeptide 
A 


100101 


Mm.4633 


Q62189 


Secreted 


Iap 


intracisternal A particles 


97181 f 


Mm.212712 


P03975 


Secreted 


Tff2 


Mus musculus spasmolytic polypeptide 
(mSP) gene, complete cds. 


93302 






Signaling 


Gnb4 


guanine nucleotide binding protein, beta 4 


93949 


Mm.9336 


P29387 


Signaling 


Tsc2 


tuberous sclerosis 2 


97953 g 


Mm.30435 


061037 


Structural 


Fscn1 


fascin homolog 1, actin bundling protein 
(Strongylocentrotus purpuratus) 


92838 


Mm.13194 


Q61553 


Transcription 


Irf1 


Interferon regulatory factor 1 


102401 


Mm.1246 


P15314 


Transcription 


Cited2 


Cbp/p300-interacting transactivator, with 
Glu/Asp-rich carboxy-terminal domain, 2 


101973 


Mm.9524 


035740 


Transcription 


Ncor1 


nuclear receptor co-repressor 1 


101536 


Mm.88061 


060974 


Transcription 


Sox6 


SRY-box containing gene 6 


92726 


Mm.4656 


P40645 


Transcription 


Hhex 


Mus musculus Hex(Prh) gene, exon 4 and 
complete cds. 


98408 


Mm.33896 


Q9R1X2 


Transcription 


Trim30 


tripartite motif protein 30 


98030 


Mm.3288 


P15533 


Transcription 


Tieg 


TGFB inducible early growth response 


99602 


Mm.4292 


O8909I 


Transcription 


Klf2 


Kruppel-like factor 2 (lung) 


96109 


Mm.26938 


Q60843 


Transcription 


Eif4a2 


eukaryotic translation initiation factor 4A2 


93089 


Mm.16323 


P10630 


Transcription 


H2a-615 


Mus musculus histone H2a3-615 (H2a- 
615), and histone H3.2-615 (H3-615) 
genes, complete cds. 


93068_r 




P20670 


Transcription 


Nfe2l2 


Mus musculus p45 NF-E2 related factor 2 
(NRF2) gene, exon 2 to exon 5 and 


92562 




Q60795 
Transcription 

Transcription 


Fli1 
Mcmd5 


complete cds. 

Friend leukemia integration 1 

mini chromosome maintenance deficient 5 
(S. cerevisiae) 


4698 

00156 I* 
00708 * 


Am.\ 19781 I 
4m.5048 I 

vlm.18516 I 


•26323 ! 
'49718 

>06351 


Transcription 
Transcription 

Transcription 
Transcription 


H3f3b 
Rev3l 

Hoxb5 

Pbx1 


H3 histone, family 3B 

REV3-like, catalytic subunit of DNA 

polymerase zeta RAD54 like (S. cerevisiae) 

homeo box B5 

pre B-cell leukemia transcription factor 1 


103457 I 

103666 1 
94804 1 
93324 


vim^io/ y 

vtm.207 1 

VlmJ22124o 

Mm.18571 


^61493 ( 
P09079 
P23950 \ 


Transcription 
Transcription 
Transcription 
Transcription 


Zfp36l1 
Myb 
Sp4 
Idb2 


zinc finger protein 36, C3H type-like 1 

myeloblastosis oncogene 

trans-acting transcription factor 4 

Mus musculus helix-loop-helix protein Id2 

gene, 3' region. 


92644 s 
92992 i 
93013 

160447 


Mm.1202 
Miru5073 

Mm.3792 


P06876 

mft ion 1 

P70187 


Transmembrane 
Transmembrane 


Hiat1 


hippocampus abundant gene transcript 1 
mouse gene for the constant part of gamma- 
immunoglobulin. 


101870 
101054 


Mm.7043 


P01869 
P04441 1 


Transmembrane 
Transmembrane 


Ii 

H2-Aa 


Ia-associated invariant chain 

histocompatibility 2, class II antigen A, 
alpha 


92866 
103997 


Min.175310 


P23150 

PI 4753 [ 


Transmembrane 
Transmembrane 


Epor 
Irs2 


Mouse gene for erythropoietin receptor. 
Mus musculus insulin receptor substrate-2 
(Irs2) gene, partial cds. 


92205 
94285 


Mm22564 


088970 
061857 ! 


Transmembrane 
Transmembrane 


H2-Eb1 
Tnfrsf17 


histocompatibility 2, class II antigen E beta 
tumor necrosis factor receptor superfamily, 
member 17 


94190 
92527 


Mm.12935 
Mm.4294 


088472 
P51830 


Transmembrane 
Transmembrane 

Transmembrane 
Transport 


Adcy9 
Edg1 

Fzd4 

Vps35 

Hbb-b2 


adenylate cyclase 9 

endothelial differentiation sphingolipid G- 
protein-coupled receptor 1 
frizzled homolog 4 (Drosophila) 
vacuolar protein sorting 35 
Mouse gene for beta-1-globin. 


161788_f 

93459 s 

92640 

103534 


Mm.982 

Mm.68712 
Mm.196201 


09EOH3 | 
P02089 


Transport 
Transport 
Transport 
Transport 
Transport 

Zinc Finger 


Kpnb1 
Rab9 
Rac1 
Rab33b 

Zfp216 


karyopherin (importin) beta 1 

RAB9, member RAS oncogene family 

RAS-related C3 botulinum substrate 1 
Mus musculus DNA for Rab33B, exon 2 

and complete cds. 

zinc finger protein 216 


93111 
95516 
101555 
103062 

160321 
160205 f 


Mm.16710 
Mm.25306 
Mm889 

Mm2904 
Mm.25228 


P70168 
O9R0M6 
P15154 
035963 j 

088878 
Q90YK7 


Zinc Finger 
Zinc Finger 
Zinc Finger 

Zinc Finger 


Rnf11 

Nbr1 

pol 

Gfi1b 


ring finger protein 11 

next to the Brca1 

Mus musculus clone MIA14 full-length 
intracisternal A-particle gag protein gene, 
complete cds; and pol pseudogene, 

complete sequence. 

growth factor independent 1B 


101484 
93907_f 

102260 
98098 


Mm.784 

Mm.10804 
Mm3471 


P97432 
P11365 

070237 
P13634 


Zinc Finger 


Car1 
Cul4a 

D7Wsu128e 

Rhced 
AU044919 


carbonic anhydrase 1 

cullin 4A 

DNA segment, Chr 7, Wayne State 
University 128, expressed 

Rhesus blood group CE and D . 

expressed sequence AU044919 


104288 
103861_s 

103340 
102823 
102372 


Mm22276 
Mm.21103 

Mm.195461 
Mm.14438 
Mm.1192 


O9OX04 




Lisch7 

Igh-VJ558 

0910001L24 

Rik 

Txnip 


immunoglobulin joining chain 

liver-specific bHLH-Zip transcription factor 
immunoglobulin heavy chain. ( J558 family) 
RIKEN cDNA 0910001L24 gene 

thioredoxin interacting protein 


162274 f 
161486 f 
161243J 

160547 s 


Mm.4067 

Mm.157783 

Mm.22637 

Mm.77432 






Dr1 

4933429H1S 
Rik 


down-regulator of transcription 1 
Mus musculus, Similar to translocation 
protein 1, clone IMAGE:5347105, mRNA, 
partial cds 


160449 
160136J* 


Mm38184 
Mm.200980 






1500010B24 

Rik 

IgM 


RIKEN cDNA 1500010B24 gene 

Mus castaneus IgK chain gene, C-region, 3' 


160111 
102156 f 


Mm.65264 
end. 










AA409749 


expressed sequence AA409749 


1 UU / HZ 


Mm 3628 






D2Ertd63e 


DNA segment, Chr 2, ERATO Doi 63, 
expressed 


QSR69 
yjooz 


Mm 24965 






Igk-V28 


Mus musculus anti-HIV-1 reverse 
transcriptase single-chain variable fragment 
mRNA, complete cds 


1UUJZZ 


Mm 990154 






5830431A10 
Rik 


RIKEN cDNA 5830431A10 gene 


94136 


Mm.1148 






IgKVI 


Mouse Ig active lambda-1-chain C-region 
gene, 3' end. 


93638 s 








Imap38 


immunity-associated protein, 38 kDa 


92489 


Mm 197478 

(Villi. 1 7 f*T # O 


P70224 ' 




923 1 6 f 


Mouse germline Ig lambda-2-chain C- 
region gene, 3' end. 


091 1 ft f 








2700007P21 
Rik 


RIKEN cDNA 2700007P21 gene 


92268 


Mm.3587 






104477 


ESTs 


104477 


Mm.29940 






0610012A05 
Rik 


RIKEN cDNA 0610012A05 gene 


104206 


Mm./ /oiy 






Atp6sl 


Mus musculus, clone MGC:37615 
IMAGE:4989784, mRNA, complete cds 


103699J 


Mm.222723 






Gbp3 


guanylate nucleotide binding protein 3 


103202 


Mm.1909 






immunoglob 
ulin V region 


Mouse mRNA for immunoglobulin gamma- 
3 V-D-J region and secreted constant 
region, complete cds. 


102721 








AI256744 


Mus musculus, clone IMAGE:3500612, 
mRNA, partial cds 


102233 


ivim.iUHj 






Ptdss1 


phosphatidylserine synthase 1 


101931 


Mm.9440 


055024 




Gga1 


golgi associated, gamma adaptin ear 
containing, ARF binding protein 1 


98445 


Mm.34525 






4121402D02 
Rik 


RIKEN cDNA 4121402D02 gene 


97935 


Mm.30252 






Iigp 


interferon-inducible GTPase 


96764 








2310022K15 
Rik 


RIKEN cDNA 2310022K15 gene 


95622 


Mm.28047 






Vcl 


vinculin 


94963 


Mm.12842 






2610319K07 
Rik 


RIKEN cDNA 2610319K07 gene 


104744 


»x„ inn/no 
Mm.zU04/y 






Iga 


Mouse Ig germline D-J-C region alpha gene 
and secreted tail; Mouse germ line gene for 
immunoglobulin alpha H constant part 
(coding for the last three exons) 


100583 








Prpf8 


pre-mRNA processing factor 8 


VOJ /** 


Mm 19S9 






Scotin 


scotin gene 


os in? 

7 J IUZ 


Mm 196S33 






1110035L05 
Rik 


RIKEN cDNA 1110035L05 gene 


7JUDZ 


Mm 90140 






3110001A13 

Rik 


RIKEN cDNA 3110001A13 gene 


96640 


Mm.200627 






Vps26 


vacuolar protein sorting 26 (yeast) 


96665 


Mm.27373 







mu- 

immunoglob 
ulin 


Mouse germ line gene fragment for mu- 
immunoglobulin C-terminus (secreted 
form). 


93583 s 








H19 


M.musculus H19 mRNA. 


93028 




061638 




Car2 


carbonic anhydrase 2 


92642 


Mm.1186 






Rae1 


RAE1 RNA export 1 homolog (S. pombe) 




Mm 41 It 








microtubule-associated protein 1 light chain 
3 


160288 


Mm.28357 






1700008C22 
Rik 


RIKEN cDNA 1700008C22 gene 


160123 


Mm. 177990 






98254_f 


un98fD6.xl NCl_CGAPJvlam6 Mus 
musculus cDNA clone IMAGE:2581955 3' 
similar to gb:M 10062 Mouse IgE-binding 
factor mRNA, complete cds (MOUSE); 
mRNA sequence. 


98254_f 








Eef2 


eukaryotic translation elongation factor 2 


97559 


Mm.27818 


061509 
Igk-V28 


immunoglobulin kappa chain variable 28 
(V28) 




Mm 104747 






9030022E12 


RIKEN cDNA 9030022E12 gene 




Mm 0 






D18362 


expressed sequence D18362 


lU\5ZUO 


Mm OH 547 






Hey1 


Mus musculus 6 days neonate head cDNA, 
RIKEN full-length enriched library, 
clone:5430408K11, hairy/enhancer-of-split 
related with YRPW motif 1, full insert 
sequence 


101913 


Mm.222825 






shim 


shroom 


i wuz** 


Mm df,Ci\d 






AW547365 


expressed sequence AW547365 


7 IHZJ 


Mm 7fini *» 
lYl TO. J UU 1 D 






D8Ertd69e 


DNA segment, Chr 8, ERATO Doi 69, 
expressed 


y**yzz_i 


Mi« o/;fino 

lYlin.ZOOUy 






Frap1 


FK506 binding protein 12-rapamycin 
associated protein 1 


tn<nnQ 

l\jH /{JO 


Mm 91 1 *?R 







4933434E20 
Rik 


RIKEN cDNA 4933434E20 gene 


104038 


Mm.21451 






1810009A16 
Rik 


RIKEN cDNA 1810009A16 gene 


104041 


Mm.21458 






Pex11a 


peroxisomal biogenesis factor 11a 


103660 


Mmzuol j 






AU044919 


expressed sequence AU044919 


102824 g 


Mm. 144 Jo 






MGC29044 


hypothetical protein MGC29044 


102375 


Mm.1196 






Mkrn1 


makorin, ring finger protein, 1 


101070 


Mm.7198 






LOC207933 


similar to Isopentenyl-diphosphate delta- 
isomerase (IPP isomerase) (Isopentenyl 
pyrophosphate isomerase) 


96269 


Mm.29847 






Elp3 


elongation protein 3 homolog (S. 
cerevisiae) 


95717 


Mm.29719 






Add1 


adducin 1 (alpha) 


94535 


Mm.29052 






Pbef 


pre-B-cell colony-enhancing factor 


94461 


Mm.28830 






4930588AI8 
Rik 


Mus musculus, clone IMAGE:4457493, 
mRNA 


96717 


Mm.233830 






Dad1 


Mus musculus Defender against Apoptotic 
Death (Dad1) gene, exon 3. 


96008 








2410015A15 
Rik 


RIKEN cDNA 2410015A15 gene 


95433 


Mm.24495 






Xbp1 


X-box binding protein 1 


94821 


Mm.22718 






Net1 


neuroepithelial cell transforming gene 1 


94223 


Mm.22261 


09Z1L7 




Igk-V28 


immunoglobulin kappa chain variable 28 
(V28) 


93086 


Mm.104747 






LOC218490 


similar to Transcription factor BTF3 (RNA 
polymerase B transcription factor 3) 


93057 


Mm.1538 






Lamc1 


laminin, gamma 1 


161706 f 


Mm.1249 






AI450287 


expressed sequence AI450287 


161596 f 


Mm.222827 






Sep15 


15-kDa selenoprotein 


160360 


Mm.29812 






LOC229906 


similar to TRANSCRIPTION INITIATION 
FACTOR IIB (TFIIB) (RNA 
POLYMERASE II ALPHA INITIATION 
FACTOR) 


160225 


Mm.27213 






2810043003 
Rik 


RIKEN cDNA 28 10043003 gene 


98756 


Mm.45532 






96532 


ESTs, Highly similar to nucleolar protein 
GU2 [Mus musculus] [M.musculus] 


Q6519 


Mm 






Myt1l 


myelin transcription factor 1-like 


96495 


Mm.2523 


P97500 




2010004A03 
Rik 


RIKEN cDNA 2010004A03 gene 


94802 


Mm.35302 






C79248 


expressed sequence C79248 


94689 


Mm.153895 






Mylk 


myosin, light polypeptide kinase 


93482 


Mm.27680 






D1Ertd147e 


DNA segment, Chr 1, ERATO Doi 147, 
expressed 


93191 


Mm.5572 






R75364 


expressed sequence R75364 


92397 


Mm.89393 






92245 


ESTs, Highly similar to nucleolar protein 
GU2 [Mus musculus] [M.musculus] 


92245 


Mm.35019 
Ctse 


Mus musculus cathepsin E gene, exon 1, 
partial. 


104696 








AA420392 


expressed sequence AA420392 


104670 


Mm.32357 






Acyp2 


acylphosphatase 2, muscle type 


104258 


Mm.28407 






Lrba 


LPS-responsive beige-like anchor 


104264 


Mm.28458 






Dock2 


dedicator of cyto-kinesis 2 


103462 


Mm.2173 






Gabpa 


GA repeat binding protein, alpha 


103440 


Mm.18974 






Nrip1 


nuclear receptor interacting protein 1 


103288 


Mm.20895 


Q9Z2K2 




AI225904 


expressed sequence AI225904 


103200 


Mm.1902 






98438 f 


Mouse Q4 class I MHC gene (exon 5). 


98438 f 




Q31220 




2010012D11 
Rik 


RIKEN cDNA 2010012D11 gene 


96231 


Mm 140243 






AU019574 


Mus musculus, Similar to hypothetical 
protein FLJ11110, clone MGC:11734 
IMAGE:3968418, mRNA, complete cds 


96172 


Mm.28395 






9130415E20 
Rik 


RIKEN cDNA 9130415E20 gene 


95020 


Mm.40620 






95021 


Mus musculus, clone IMAGE:4502890, 
mRNA 


95021 


Mm.27476 






AW495846 


expressed sequence AW495846 


104549 


Mm.23702 






Gtpbp2 


GTP binding protein 2 


104144 


Mm.22147 






2310050N11 
Rik 


RIKEN cDNA 2310050N11 gene 


104114 


Mm 919S4 






Ormdl3 


ORMl-like 3 (S. cerevisiae) 


98065 


Mm.180546 






2610003J05 
Rik 


RIKEN cDNA 2610003J05 gene 


97491 


MmJ105I 






Map17 


membrane-associated protein 17 


96935 


Mm.30181 






Gabarapl2 


GABA(A) receptor-associated protein like 2 


96840 


Mm 10017 

1V11 11. J\J \J I / 






2310050K10 
Rik 


RIKEN cDNA 2310050K10 gene 


95743 


Mm.29769 






All 82287 


expressed sequence All 82287 


94469 


Mm.28848 






Nudel 


nuclear distribution gene E-like 


98884 r 


Mm.31979 






Cpne1 


copine I 


97199 


Mm.27660 






Dnajb9 


DnaJ (Hsp40) homolog, subfamily B, 
member 9 


96680 


Mm.27432 






95488 


Mus musculus, clone IMAGE:3597827, 
mRNA, partial cds 


95488 


Mm.25018 






2700059C12 
Rik 


RIKEN cDNA 2700059C12 gene 


93312 


Mm.18485 






Sdcbp 


syndecan binding protein 


93017 


Mm.14744 


088601 

Table 7. Transmembrane Proteins Enriched in Mouse HSCs 



Classification 


Description 


surface antigen 


Histocompatibility 2, class II antigen E beta 


receptor 


Gamma-aminobutyric acid (GABA) B receptor, 1 


oncogene 


Myeloproliferative leukemia virus oncogene (TPOR) 


surface antigen 


Histocompatibility 2, class II antigen A alpha 




Cytotoxic T lymphocyte-associated protein 2 beta 


receptor 


Erythropoietin receptor 


oncogene 


Kit oncogene 




Coagulation factor II (thrombin) receptor 




Frizzled homolog 4 (Drosophila) 




Membrane-associated protein 17 


surface glycoprotein 


ESTs similar to C2 1 1 Human putative surface glycoprotein 



Table 8. Transcription Factors Upregulated in Mouse HSCs 



Symbol 

Klf2 

Nmyc1 
Zfx1ha 


Description 

Kruppel-like factor 2 (lung) 
neuroblastoma myc-related 

oncogene 1 

zinc finger homeobox 1a 


Fold change 
44.9 

10.4 
10.4 
9.0 


Accession No. 

NM 008452 

NM 008709 
NM 011546 
NM 008091 


Gata3 
Tcf15 

Tal1 


GATA-binding protein 3 

transcription factor 15 
T-cell acute lymphocytic 

leukemia 1 


8.6 

8.3 
7.2 


NM 009328 

NM 011527 
NM 008268 


Hoxb5 
Meisl 


homeo box B5 
myeloid ecotropic viral 
integration site 1 


7.1 


NM 010789 


Pbx3b 


Mus musculus transcription 
factor PBX3b 


6.5 


AF020200 


Cited2 


Cbp/p300-interacting 
transactivator 2 


5.2 
3.6 


NM 010828 
none 


Atf2 
Pbx1 


activating transcription factor 2 
pre B-cell leukemia 
transcription factor 1 


4.7 
4.5 


NM 008783 
Mm.24637 


None 

None 
Btf3 


chromatin remodeling factor 
EST similar to PRE-MRNA 
SPLICING FACTOR SRP20 
basic transcription factor 3 


3.4 
3.2 
2.7 


Mm.29915 
none 

NM 011544 


Tcf12 

Madh7 
Hhex 


transcription factor 12 
MAD homolog 7 (Drosophila) 
hematopoietically expressed 
homeobox 


2.7 
IS 


NM 008543 
NM 008245 

Example 5. Hierarchical Clustering Analysis of Differentially Expressed Genes 

This Example describes a study aimed at determining whether genes differentially 
expressed within the HSC compartment are also expressed in other tissues. To perform this 
analysis, we compared the expression levels of 210 differentially expressed HSC genes 
with a database composed of 45 normal tissues. Hierarchical clustering of these data was 
used to group both tissues and genes with similar expression patterns. The three HSC 
cell subsets formed a distinct branch in this analysis, with the LTR-enriched 38+34- cells 
forming a discrete branch compared to the STR cells (38+34+ and 38-34+). This clustering 
pattern is consistent with the stem cell activity pattern within the three subsets. Importantly, 
the HSC samples do not cluster near the bone or bone marrow samples, suggesting that the 
differentially expressed HSC genes are not bone marrow related. This analysis also showed 
that the majority of these genes were not ubiquitously expressed, although most were 
expressed at comparable levels in at least one other tissue. 
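
The grouping step described above can be illustrated with a minimal sketch. The specification does not state which distance metric or linkage rule was used, so the choice of 1 minus the Pearson correlation with average linkage is an assumption, and the sample names and expression values below are hypothetical toy data rather than measurements from the study.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def average_linkage(profiles):
    """Agglomerative clustering using distance = 1 - Pearson r (assumed metric).

    profiles maps a sample name to its list of expression values.
    Returns the merges in order as (members_a, members_b, distance).
    """
    clusters = [(name,) for name in profiles]
    dist = {
        frozenset((a, b)): 1.0 - pearson(profiles[a], profiles[b])
        for a, b in combinations(profiles, 2)
    }

    def cluster_distance(c1, c2):
        # Average of all pairwise distances between the two clusters.
        pairs = [dist[frozenset((a, b))] for a in c1 for b in c2]
        return sum(pairs) / len(pairs)

    merges = []
    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2),
                     key=lambda pair: cluster_distance(*pair))
        merges.append((c1, c2, cluster_distance(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


# Hypothetical toy data: five samples, six genes (not values from the study).
toy = {
    "HSC_38+34-": [9.0, 8.5, 1.0, 1.2, 7.9, 0.8],
    "HSC_38+34+": [8.0, 8.8, 1.5, 1.0, 6.5, 1.1],
    "HSC_38-34+": [7.6, 9.0, 1.9, 0.9, 6.0, 1.3],
    "bone_marrow": [1.0, 1.4, 8.0, 7.5, 1.2, 6.9],
    "liver": [0.8, 1.1, 7.0, 8.2, 1.0, 7.5],
}
merges = average_linkage(toy)
```

In this toy input the three HSC subsets merge with one another before joining any other tissue, which is the kind of distinct branch the Example reports. The same routine could be applied to the transposed matrix to group genes instead of samples.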

Three of the genes were found to have their peak expression within the HSC 
compartment. These were the scaffolding protein Gab1 (GRB2-associated binding protein 
1) and the uncharacterized gene A430017F18, which displayed the highest level of expression 
in the LTR-enriched CD38+CD34- cells, and the Pdgfrb gene (platelet derived growth factor 
receptor, beta polypeptide), which peaked within the 38+34+ STR HSC subset. Although the 
majority of these genes are also expressed at comparable levels in other tissues, it is 
important to note that in many cases the level of expression in HSC subsets was at or near 
the peak expression determined for these genes across the entire 45-tissue panel. The high 
relative expression within HSCs of this subset of genes indicates that they are likely to play an 
important role in the biology of HSCs. 
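
As a rough illustration of the peak-expression comparison described above, the following sketch finds genes whose maximum expression across a tissue panel falls within the HSC samples and reports how close an HSC value is to a gene's panel-wide peak. The gene names, sample names and values are hypothetical, and the function names are introduced here for illustration only.

```python
def peak_sample(expression):
    """Return the name of the sample in which a gene's expression is highest."""
    return max(expression, key=expression.get)


def genes_peaking_in(panel, compartment):
    """List genes whose maximum expression falls in one of the given samples."""
    return sorted(
        gene for gene, expression in panel.items()
        if peak_sample(expression) in compartment
    )


def fraction_of_peak(expression, sample):
    """Expression in `sample` relative to the gene's maximum across the panel."""
    return expression[sample] / max(expression.values())


# Hypothetical values for illustration only; not measurements from this study.
panel = {
    "gene_A": {"HSC_38+34-": 9.1, "HSC_38+34+": 7.0, "liver": 3.2, "brain": 2.8},
    "gene_B": {"HSC_38+34-": 5.0, "HSC_38+34+": 8.4, "liver": 4.1, "brain": 6.0},
    "gene_C": {"HSC_38+34-": 6.3, "HSC_38+34+": 6.0, "liver": 7.0, "brain": 1.5},
}
hsc_samples = {"HSC_38+34-", "HSC_38+34+"}

hsc_peak_genes = genes_peaking_in(panel, hsc_samples)
near_peak = fraction_of_peak(panel["gene_C"], "HSC_38+34-")
```

Here gene_A and gene_B peak within the HSC samples, while gene_C peaks elsewhere but is still expressed in one HSC subset at a level close to its panel-wide maximum, mirroring the "at or near the peak" observation.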

*** 

It is understood that the examples and embodiments described herein are for 
illustrative purposes only and that various modifications or changes in light thereof will be 
suggested to persons skilled in the art and are to be included within the spirit and purview of 
this application and scope of the appended claims. Although any methods and materials 
similar or equivalent to those described herein can be used in the practice or testing of the 
present invention, the preferred methods and materials are described. 

All publications, GenBank sequences, patents and patent applications cited 
herein are hereby expressly incorporated by reference in their entirety and for all purposes 
as if each were individually so denoted. 

WE CLAIM: 

1. A method for inhibiting differentiation of mammalian stem cells, 
comprising (a) providing a population of stem cells, (b) introducing a vector comprising an 
HSC differentiation-inhibiting polynucleotide sequence shown in Table 1 and Table 4 into 
the stem cells, and (c) expressing a polypeptide encoded by the polynucleotide by culturing 
the modified stem cells, thereby inhibiting differentiation of the stem cells. 

2. The method of claim 1, wherein the population of stem cells are isolated 
from bone marrow. 

3. The method of claim 1, wherein the stem cells are human hematopoietic 
stem cells. 

4. The method of claim 3, wherein the stem cells are first selected for 
expression of CD34 and Thy prior to introducing the vector. 

5. The method of claim 1, wherein the stem cells are mouse hematopoietic 
stem cells. 

6. The method of claim 5, wherein the stem cells are first selected for 
expression of CD38 and lack of expression of CD34 prior to introducing the vector. 

7. The method of claim 1, wherein the HSC differentiation-inhibiting 
polynucleotide encodes GATA-binding protein 3 (Gata3) or ID3. 

8. A method for increasing the effective dose of hematopoietic stem cells in 
a mammalian subject, comprising (a) providing a population of hematopoietic stem cells, (b) 
introducing into the cells an HSC differentiation-inhibiting polynucleotide selected from 
Table 1 and Table 4, and (c) administering the genetically modified cells that express an 
HSC differentiation-inhibiting polypeptide to a mammalian subject; thereby increasing the 
effective dose of hematopoietic stem cells in the subject. 

9. The method of claim 8, wherein the administered stem cells are a 
subpopulation of the modified cells that are selected for expression of the polypeptide prior 
to administering to the subject. 

10. The method of claim 8, wherein the administered stem cells overexpress 
the HSC differentiation-inhibiting polypeptide. 

11. The method of claim 8, wherein the hematopoietic stem cells are obtained 
from bone marrow. 

12. The method of claim 8, wherein the subject is human, and the 
hematopoietic stem cells are human hematopoietic stem cells. 

13. The method of claim 12, wherein the hematopoietic stem cells are 
selected for expression of CD38 and Thy prior to introduction of the HSC differentiation- 
inhibiting polynucleotide. 

14. The method of claim 8, wherein an expression vector comprising the HSC 
differentiation-inhibiting polynucleotide is introduced into the cells. 

15. A method for inhibiting hematopoietic stem cell differentiation, 
comprising contacting a population of HSCs with an effective amount of an HSC 
differentiation-inhibiting polypeptide selected from Tables 1 and 4, thereby inhibiting 
differentiation of the HSCs. 

16. The method of claim 15, wherein the HSCs are present in an in vitro cell 
culture. 

17. The method of claim 15, wherein the HSCs are present in a subject 
grafted with the HSCs. 

18. The method of claim 15, wherein the subject is human, and the HSC 
differentiation-inhibiting polypeptide is selected from the group shown in Table 2. 

19. A method for isolating a population of cells that are enriched for 
hematopoietic stem cells (HSCs), the method comprising (a) obtaining a sample of cells 
containing hematopoietic stem cells, (b) selecting cells from the sample based on expression 
or lack of expression of at least one known HSC surface marker, and at least one molecule 
shown in Table 2 and Table 7, and (c) separating cells with the known HSC marker and at 
least one of the molecules shown in Table 2 and Table 7, thereby isolating a population of 
human cells enriched for hematopoietic stem cells. 

20. The method of claim 19, wherein the hematopoietic stem cells are human 
HSCs. 

21. The method of claim 20, wherein the known HSC marker is CD34+ and 
Thy+. 

22. The method of claim 20, wherein the at least one molecule is a surface 
molecule shown in Table 2. 

23. The method of claim 19, wherein the hematopoietic stem cells are mouse 
HSCs. 

24. The method of claim 23, wherein the known HSC marker is CD38+ and 
CD34-. 

25. The method of claim 23, wherein the isolated population of cells are also 
selected for expression of c-kit and Sca-1 but lack of expression of Lin. 

26. The method of claim 19, wherein the sample of cells are obtained from 
bone marrow. 

27. A method of enumerating hematopoietic stem cells in a population of 
cells, comprising (a) contacting the population of cells with an antibody that specifically 
binds to one HSC surface marker shown in Table 2 and Table 7 under conditions which 
allow the antibody to specifically bind to the HSC surface marker; and (b) quantifying the 
cells recognized by the antibody; thereby enumerating hematopoietic stem cells in the 
population of cells. 

28. The method of claim 27, wherein the population of cells is a mixture of 
hematopoietic cells. 

29. The method of claim 27, wherein hematopoietic stem cells are human 
HSCs, and the population of cells are first selected for expression of CD34 and Thy prior to 
the contacting. 

30. The method of claim 27, wherein hematopoietic stem cells are mouse 
HSCs, and the population of cells are first selected for expression of CD38 but lack of 
expression of CD34 prior to the contacting. 